- Spark.FlatMapIterator
- Spark.JavaPairRDD
- Spark.JavaRDD
- Spark.PipelinedPairRDD
- Spark.PipelinedPairRDD
- Spark.PipelinedRDD
- Spark.PipelinedRDD
- Spark.SparkContext
- Spark.SparkContext
- Base.close
- Base.collect
- Base.collect
- Base.count
- Base.map
- Base.reduce
- Spark.add_file
- Spark.add_jar
- Spark.cache
- Spark.cache
- Spark.cartesian
- Spark.chain_function
- Spark.coalesce
- Spark.collect_internal
- Spark.collect_internal_itr
- Spark.collect_itr
- Spark.collect_itr
- Spark.context
- Spark.create_flat_map_function
- Spark.create_map_function
- Spark.deserialized
- Spark.flat_map
- Spark.flat_map_pair
- Spark.group_by_key
- Spark.id
- Spark.map_pair
- Spark.map_partitions
- Spark.map_partitions_pair
- Spark.map_partitions_with_index
- Spark.num_partitions
- Spark.pipe
- Spark.readobj
- Spark.reduce_by_key
- Spark.repartition
- Spark.serialized
- Spark.share_variable
- Spark.text_file
- Spark.writeobj
Spark.JavaRDD — Type.Pure wrapper around JavaRDD
Spark.SparkContext — Type.Wrapper around JavaSparkContext
Spark.SparkContext — Method.Params:
- master - address of the application master. Only local and standalone modes are currently supported. Defaults to 'local' 
- appname - name of the application 
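A minimal construction sketch, assuming the keyword-argument form implied by the parameter list above:

```julia
using Spark

# Keyword arguments are an assumption based on the params listed above.
sc = SparkContext(master="local", appname="example")
# ... build and run RDD pipelines ...
close(sc)  # release the underlying JavaSparkContext
```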
Base.close — Method.Close SparkContext
Base.collect — Method.Collect all elements of rdd on a driver machine
Base.collect — Method.Collect all elements of rdd on a driver machine
Base.count — Method.Count number of elements in this RDD
Base.map — Method.Apply function f to each element of rdd
Base.reduce — Method.Reduce elements of rdd using specified function f
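Together these support a basic pipeline. A hedged sketch, assuming text_file(sc, path) (documented below) and the (rdd, f) argument order:

```julia
rdd   = text_file(sc, "data.txt")       # one element per line (assumed)
lens  = map(rdd, line -> length(line))  # transform each element
total = reduce(lens, +)                 # aggregate with a binary function
n     = count(rdd)                      # number of elements in the RDD
rows  = collect(rdd)                    # fetch all elements to the driver
```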
Spark.cache — Method.Persist this RDD with the default storage level (MEMORY_ONLY)
Spark.cache — Method.Persist this RDD with the default storage level (MEMORY_ONLY)
Spark.cartesian — Method.Create a pair RDD with every combination of the elements of rdd1 and rdd2 (the Cartesian product)
Spark.coalesce — Method.Return a new RDD that is reduced into num_partitions partitions.
Spark.flat_map — Method.Similar to map, but each input item can be mapped to 0 or more output items (so f should return an iterator rather than a single item)
Spark.flat_map_pair — Method.Similar to map, but each input item can be mapped to 0 or more output items (so f should return an iterator of pairs rather than a single item)
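For example, splitting lines into words is a natural flat_map (a sketch; the (rdd, f) argument order is assumed):

```julia
words = flat_map(rdd, line -> split(line))  # each line yields zero or more words
```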
Spark.group_by_key — Method.When called on a dataset of (K, V) pairs, returns a dataset of (K, [V]) pairs.
Spark.id — Method.Return the id of the rdd
Spark.map_pair — Method.Apply function f to each element of rdd, producing a pair RDD (f should return a key-value pair)
Spark.map_partitions — Method.Apply function f to each partition of rdd. f should be of type (iterator) -> iterator
Spark.map_partitions_pair — Method.Apply function f to each partition of rdd. f should be of type (iterator) -> iterator
Spark.map_partitions_with_index — Method.Apply function f to each partition of rdd. f should be of type (index, iterator) -> iterator
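A sketch of the partition-level variants; each per-partition function receives, and must return, an iterator:

```julia
# Sum each partition in a single pass; a 1-tuple is a valid iterator.
sums = map_partitions(rdd, it -> (sum(it),))

# Index-aware variant: tag every element with its partition index.
tagged = map_partitions_with_index(rdd, (idx, it) -> ((idx, x) for x in it))
```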
Spark.num_partitions — Method.Returns the number of partitions of this RDD.
Spark.pipe — Method.Return an RDD created by piping elements to a forked external process.
Spark.reduce_by_key — Method.When called on a dataset of (K, V) pairs, returns a dataset of (K, V) pairs where the values for each key are aggregated using the given reduce function func, which must be of type (V,V) => V.
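Combined with flat_map and map_pair, this yields the classic word count. A sketch under the same assumed (rdd, f) argument order:

```julia
words  = flat_map(rdd, line -> split(line))  # line -> zero or more words
pairs  = map_pair(words, w -> (w, 1))        # emit (K, V) = (word, 1)
counts = reduce_by_key(pairs, +)             # (V, V) -> V aggregation per key
collect(counts)                              # e.g. [("spark", 42), ...]
```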
Spark.repartition — Method.Return a new RDD that has exactly num_partitions partitions.
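coalesce only shrinks the partition count (avoiding a full shuffle where possible), while repartition can grow or shrink it. A sketch:

```julia
rdd8 = repartition(rdd, 8)   # exactly 8 partitions (assumed signature)
rdd2 = coalesce(rdd8, 2)     # reduce to 2 partitions
num_partitions(rdd2)         # expected: 2
```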
Spark.share_variable — Method.Makes the value of data available on workers under the given symbol name
Spark.text_file — Method.Create RDD from a text file
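A sketch of sharing driver data with workers; the (sc, name, data) argument order is an assumption based on the description above:

```julia
# Hypothetical argument order: (sc, name::Symbol, data).
share_variable(sc, :stopwords, Set(["the", "a", "an"]))
```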
Spark.FlatMapIterator — Type.Iterates over the iterators contained within an iterator, yielding their elements in sequence (flattening one level)
Spark.JavaPairRDD — Type.Pure wrapper around JavaPairRDD
Spark.PipelinedPairRDD — Type.Julia type to handle Pair RDDs. Can handle pipelining of operations to reduce interprocess IO.
Spark.PipelinedPairRDD — Method.Params:
- parentrdd - parent RDD 
- func - function of type (index, iterator) -> iterator to apply to each partition
Spark.PipelinedRDD — Type.Julia type to handle RDDs. Can handle pipelining of operations to reduce interprocess IO.
Spark.PipelinedRDD — Method.Params:
- parentrdd - parent RDD 
- func - function of type (index, iterator) -> iterator to apply to each partition
Spark.add_file — Method.Add file to SparkContext. This file will be downloaded to each executor's work directory
Spark.add_jar — Method.Add JAR file to SparkContext. Classes from this JAR will then be available to all tasks
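Both are typically called right after constructing the context (a sketch; the (sc, path) signatures are assumed):

```julia
add_file(sc, "lookup.csv")    # shipped to each executor's work directory
add_jar(sc, "deps/udfs.jar")  # classes become visible to all tasks
```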
Spark.chain_function — Method.Chain two partition functions together
Spark.collect_internal — Method.Collects the RDD to the Julia process by serializing all values through a single byte array
Spark.collect_internal_itr — Method.Collects the RDD to the Julia process via a Julia iterator that fetches one row at a time, avoiding the creation of a byte array holding all rows at once.
Spark.collect_itr — Method.Collect all elements of rdd on a driver machine
Spark.collect_itr — Method.Collect all elements of rdd on a driver machine
Spark.context — Method.Get SparkContext of this RDD
Spark.create_flat_map_function — Method.Creates a function that operates on a whole partition from an element-by-element flat_map function
Spark.create_map_function — Method.Creates a function that operates on a whole partition from an element-by-element map function
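Conceptually, the lifting and chaining look like this (a sketch, not the package's exact code; the (index, iterator) -> iterator contract comes from the PipelinedRDD entries above):

```julia
# Lift an element-wise function into a partition function.
lift_map(f)      = (idx, it) -> (f(x) for x in it)
lift_flat_map(f) = (idx, it) -> Iterators.flatten(f(x) for x in it)

# Chain two partition functions: feed f's output iterator into g.
chain(f, g) = (idx, it) -> g(idx, f(idx, it))
```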
Spark.deserialized — Method.Return an object deserialized from an array of bytes
Spark.readobj — Method.Read a data object from an IO stream. Returns a code and a byte array (a sketch follows this list):
- if code is negative, it's considered as a special command code 
- if code is positive, it's considered as array length 
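A sketch of the framing this describes, not the package's actual implementation; the 4-byte network-order header is an assumption:

```julia
# Hypothetical reader for the code/payload framing described above.
function read_framed(io::IO)
    code = ntoh(read(io, Int32))      # assumed: 4-byte network-order header
    code < 0 && return code, UInt8[]  # negative: special command code
    return code, read(io, code)       # positive: payload of `code` bytes
end
```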
Spark.serialized — Method.Return the serialized object as an array of bytes
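Round-tripping through the byte representation (a sketch, assuming the two functions are inverses):

```julia
bytes = serialized([1, 2, 3])  # object -> Vector{UInt8} (assumed return type)
obj   = deserialized(bytes)    # bytes -> object
@assert obj == [1, 2, 3]
```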
Spark.writeobj — Method.Write an object to a stream