sdf_explode.Rd
Exploding an array column of length N will replicate the top-level record N times.
The ith replicated record will contain a struct (not an array) corresponding to the ith element
of the exploded array. Exploding will not promote any fields or otherwise change the schema of
the data.
sdf_explode(x, column, is_map = FALSE, keep_all = FALSE)
Argument | Description |
---|---|
x | An object (usually a |
column | The field to explode |
is_map | Logical. The (scala) |
keep_all | Logical. If |
Two types of exploding are possible. The default method calls the Scala explode method.
This operation is supported in Spark versions > 1.6. It will, however, drop records where the
exploding field is empty/null. Alternatively, keep_all=TRUE will use the explode_outer
Scala method, introduced in Spark 2, so that no records are dropped.
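The difference between the two modes can be sketched on a small table in which one record has a null array. This is only an illustration under stated assumptions: a local Spark 2.x (or later) installation is available, and the table `toy` with its columns `id` and `vals` is hypothetical data built for this example, not part of the package. Based on the description above, the default call should drop the record with the null array, while `keep_all = TRUE` should retain it.

```r
library(sparklyr)
library(sparklyr.nested)
library(dplyr)

# Assumption: a local Spark installation (2.x or later) is available.
sc <- spark_connect(master = "local")

# Hypothetical toy data: id 1 has a two-element array, id 2 has a null array.
toy <- tbl(sc, sql("
  SELECT 1 AS id, array(10, 20) AS vals
  UNION ALL
  SELECT 2 AS id, CAST(NULL AS array<int>) AS vals
"))

# Default (explode): the record whose array is null is expected to be dropped.
dropped <- sdf_explode(toy, vals)

# keep_all = TRUE (explode_outer): the null-array record is expected to be kept.
kept <- sdf_explode(toy, vals, keep_all = TRUE)

# Compare row counts of the two results.
n_dropped <- sdf_nrow(dropped)
n_kept <- sdf_nrow(kept)

spark_disconnect(sc)
```

Comparing `n_dropped` with `n_kept` is a quick way to check whether records with empty or null arrays were lost during the explode.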
# NOT RUN {
# assumes `sc` is an existing Spark connection
library(sparklyr)
library(sparklyr.nested)
library(dplyr)

# first get some nested data
iris2 <- copy_to(sc, iris, name="iris")
iris_nst <- iris2 %>%
  sdf_nest(Sepal_Length, Sepal_Width, Petal_Length, Petal_Width, .key="data") %>%
  group_by(Species) %>%
  summarize(data=collect_list(data))

# then explode it
iris_nst %>% sdf_explode(data)
# }