{"category":"57c0a538d796900e0024ffe5","project":"568f126dbdb9260d00149d97","user":"54d84de525e90a0d00db552a","version":"57c0a538d796900e0024ffe4","updates":[],"_id":"57c0a539d796900e0024fffa","createdAt":"2016-01-22T00:05:55.399Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":0,"body":"Apache Ignite provides seamless integrations with Hadoop and Spark. While the Ignite-Hadoop integration allows you to use Ignite File System as a primary caching layer to store HDFS data, Ignite-Spark integration allows you to share state in-memory across multiple spark jobs using an implementation of Spark RDD.\n\n##Ignite for Spark\nApache Ignite provides an implementation of Spark RDD abstraction which allows to easily share state in memory across Spark jobs. The main difference between native Spark RDD and `IgniteRDD` is that Ignite RDD provides a shared in-memory view on data across different Spark jobs, workers, or applications, while native Spark RDD cannot be seen by other Spark jobs or applications. \n[block:image]\n{\n  \"images\": [\n    {\n      \"image\": [\n        \"https://files.readme.io/yBlxMlVLSCeTmFVlAEtH_spark-ignite-rdd-small.png\",\n        \"spark-ignite-rdd-small.png\",\n        \"300\",\n        \"247\",\n        \"#f44945\",\n        \"\"\n      ]\n    }\n  ]\n}\n[/block]\n[Read more](doc:ignite-for-spark)\n\n##In-Memory File System\nOne of unique capabilities of Ignite is a distributed in-memory file system called Ignite File System (IGFS). IGFS delivers similar functionality to Hadoop HDFS, but only in memory. In fact, in addition to its own APIs, IGFS implements Hadoop FileSystem API and can be transparently plugged into Hadoop or Spark deployments. \n[block:image]\n{\n  \"images\": [\n    {\n      \"image\": [\n        \"https://files.readme.io/BzYteFQ0RGufwlMclb6z_spark-igfs-small.png\",\n        \"spark-igfs-small.png\",\n        \"300\",\n        \"281\",\n        \"#f14945\",\n        \"\"\n      ]\n    }\n  ]\n}\n[/block]\n[Read More](doc:in-memory-file-system)\n\n##In-Memory Map Reduce\nIgnite In-Memory MapReduce allows to effectively parallelize the processing data stored in any Hadoop file system. It eliminates the overhead associated with job tracker and task trackers in a standard Hadoop architecture while providing low-latency, HPC-style distributed processing. \n[block:image]\n{\n  \"images\": [\n    {\n      \"image\": [\n        \"https://files.readme.io/Nb8BONhmSz6rSg7XQKoJ_ignite_mapreduce-small.png\",\n        \"ignite_mapreduce-small.png\",\n        \"300\",\n        \"197\",\n        \"#ea9147\",\n        \"\"\n      ]\n    }\n  ]\n}\n[/block]\n[Read More](doc:map-reduce)\n\n##Hadoop Accelerator\nApache Ignite Hadoop Accelerator provides a set of components allowing for in-memory Hadoop job execution and file system operations. It can be used in combination with Ignite File System and In-Memory MapReduce, and can be easily plugged in to any Hadoop distribution.\n\n[block:image]\n{\n  \"images\": [\n    {\n      \"image\": [\n        \"https://files.readme.io/M1C9lqgbQLCT2qSQG4Vc_ignite_filesystem-small.png\",\n        \"ignite_filesystem-small.png\",\n        \"300\",\n        \"207\",\n        \"#f88e34\",\n        \"\"\n      ]\n    }\n  ]\n}\n[/block]\n[Read more](doc:hadoop-accelerator)","excerpt":"","slug":"overview","type":"basic","title":"Overview","__v":0,"childrenPages":[]}
##In-Memory File System
One of the unique capabilities of Ignite is a distributed in-memory file system called Ignite File System (IGFS). IGFS delivers functionality similar to Hadoop HDFS, but entirely in memory. In addition to its own APIs, IGFS implements the Hadoop FileSystem API and can be transparently plugged into Hadoop or Spark deployments.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/BzYteFQ0RGufwlMclb6z_spark-igfs-small.png",
        "spark-igfs-small.png",
        "300",
        "281",
        "#f14945",
        ""
      ]
    }
  ]
}
[/block]
[Read more](doc:in-memory-file-system)

##In-Memory Map Reduce
Ignite In-Memory MapReduce lets you efficiently parallelize the processing of data stored in any Hadoop file system. It eliminates the overhead associated with the job tracker and task trackers of a standard Hadoop architecture while providing low-latency, HPC-style distributed processing.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/Nb8BONhmSz6rSg7XQKoJ_ignite_mapreduce-small.png",
        "ignite_mapreduce-small.png",
        "300",
        "197",
        "#ea9147",
        ""
      ]
    }
  ]
}
[/block]
[Read more](doc:map-reduce)

##Hadoop Accelerator
The Apache Ignite Hadoop Accelerator provides a set of components that enable in-memory Hadoop job execution and file system operations. It can be used in combination with the Ignite File System and In-Memory MapReduce, and can be easily plugged into any Hadoop distribution, as the client-side sketch below illustrates.

[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/M1C9lqgbQLCT2qSQG4Vc_ignite_filesystem-small.png",
        "ignite_filesystem-small.png",
        "300",
        "207",
        "#f88e34",
        ""
      ]
    }
  ]
}
[/block]
[Read more](doc:hadoop-accelerator)
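To make the In-Memory File System and In-Memory MapReduce sections above concrete, here is a minimal client-side sketch of pointing a standard Hadoop `Configuration` at Ignite: the `fs.igfs.impl` mapping lets ordinary `FileSystem` code read and write through IGFS, and the `mapreduce.framework.name` / `mapreduce.jobtracker.address` properties route a regular MapReduce job to the in-memory engine instead of a JobTracker. The file system name `igfs`, the endpoints `localhost:10500` and `localhost:11211`, and the presence of the `ignite-hadoop` libraries on the classpath are assumptions that must match your actual node configuration.

```scala
import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.mapreduce.Job

object IgniteHadoopClientSketch extends App {
  val conf = new Configuration()

  // Register IGFS as a Hadoop file system implementation (requires ignite-hadoop on the classpath).
  conf.set("fs.igfs.impl", "org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem")

  // Route MapReduce jobs to the Ignite in-memory engine instead of a JobTracker/YARN.
  conf.set("mapreduce.framework.name", "ignite")
  conf.set("mapreduce.jobtracker.address", "localhost:11211") // assumed default Ignite connector endpoint

  // Plain Hadoop FileSystem code, now backed by the in-memory IGFS store.
  val fs = FileSystem.get(new URI("igfs://igfs@localhost:10500/"), conf)
  val out = fs.create(new Path("/tmp/input/words.txt"))
  out.writeBytes("ignite hadoop spark ignite\n")
  out.close()

  // A regular MapReduce job definition; mapper, reducer, and paths are set as in any Hadoop job.
  val job = Job.getInstance(conf, "word-count-over-igfs")
  // job.setMapperClass(...); job.setReducerClass(...)
  // FileInputFormat.addInputPath(job, new Path("igfs://igfs@localhost:10500/tmp/input"))
  // job.waitForCompletion(true)
}
```

The same properties can instead be set once in `core-site.xml` and `mapred-site.xml`, which is the typical way the accelerator is wired into an existing Hadoop distribution so that unmodified jobs and tools pick it up.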