@@ -3084,6 +3084,258 @@ g.V().has('airport','code',within('IAD','MIA','LAX')).
[m]
----
+ [[collectionsteps]]
+ Working with collections
+ ~~~~~~~~~~~~~~~~~~~~~~~~
+
+ Collections are containers for other values, such as a 'List', 'Set' or 'Map'.
+ Gremlin offers a variety of steps that help construct and manipulate these
+ objects. We have already seen how a step like 'fold' can create a 'List' of
+ the objects in the traversal stream:
+
+ [source,groovy]
+ ----
+ g.V().has('region','GB-ENG').values('runways').dedup().fold()
+
+ [2,1,3,4]
+ ----
+
+ As for maps, Gremlin can produce them whenever you use a step like 'group' or
+ 'valueMap':
+
+ [source,groovy]
+ ----
+ g.V().has('code','AUS').valueMap(true,'region')
+
+ [id:3,region:[US-TX],label:airport]
+ ----
+
+ As the result in the example above shows, collections can contain other
+ collections: the value for the region key, "US-TX", is itself held in a 'List'.
+ Since collections are core to the Gremlin language, the following steps tend to be
+ quite helpful in many situations:
+
+ [cols="^1,4"]
+ |==============================================================================
+ |any | Determines if any object in a 'List' matches a specified predicate
+ |all | Determines if all objects in a 'List' match a specified predicate
+ |combine | Combines an incoming 'List' with the list provided as an argument
+ |conjoin | Takes objects in the incoming 'List' and converts them to a string joined by the specified delimiter
+ |difference | Calculates the difference between the incoming 'List' and the one provided as an argument
+ |disjunct | Calculates the disjunct set between the incoming 'List' and the provided 'List' argument (the values found in one but not both)
+ |intersect | Calculates the intersection between the incoming 'List' and the provided 'List' argument
+ |merge | Merges the incoming collection with the one provided as an argument
+ |product | Calculates the Cartesian product of the incoming 'List' and the provided 'List'
+ |reverse | Reverses the order of the incoming 'List'
+ |==============================================================================
+
+ TIP: The steps noted in the prior table are meant specifically for working with
+ collection types, but you should also note that many other Gremlin steps
+ inherently work with collections. For example, the 'local' versions of 'range',
+ 'limit' and 'tail' are good at taking parts of a collection, and 'unfold' can be
+ used to deconstruct one. These are some obvious examples, but as you learn
+ more about Gremlin you will find many more.
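+
+ As a small sketch of the steps mentioned in the tip, the following traversals use
+ 'inject' to supply an illustrative literal 'List' (the values are made up and are
+ not taken from the air-routes data). They take pieces of it with the 'local' forms
+ of 'limit', 'range' and 'tail', and then deconstruct it with 'unfold':
+
+ [source,groovy]
+ ----
+ // first three values of the list
+ g.inject([2,7,3,3,4]).limit(local,3)
+
+ [2,7,3]
+
+ // values from index 1 up to, but not including, index 3
+ g.inject([2,7,3,3,4]).range(local,1,3)
+
+ [7,3]
+
+ // last two values of the list
+ g.inject([2,7,3,3,4]).tail(local,2)
+
+ [3,4]
+
+ // unfold emits each value as its own traverser
+ g.inject([2,7,3,3,4]).unfold()
+
+ 2
+ 7
+ 3
+ 3
+ 4
+ ----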
+
+ Let's take a deeper look at these steps and how you might use them. We will start
+ with the two filtering steps, 'any' and 'all'. Oftentimes you will find yourself in
+ a situation where you have written some Gremlin that collects your results into a
+ 'List', as in the following example, which collects the number of runways for all
+ airports in the "US-TX" region:
+
+ [source,groovy]
+ ----
+ g.V().has('airport','region','US-TX').values('runways').fold()
+
+ [2,7,3,3,4,4,2,3,2,3,5,2,2,3,3,4,3,2,1,3,2,3,4,3,2,3]
+ ----
+
+ If you only wanted this result when the 'List' contained the value 4, you could
+ use 'any' with the 'P.eq' predicate:
+
+ [source,groovy]
+ ----
+ g.V().has('airport','region','US-TX').values('runways').fold().any(eq(4))
+
+ [2,7,3,3,4,4,2,3,2,3,5,2,2,3,3,4,3,2,1,3,2,3,4,3,2,3]
+ ----
+
+ Similarly, if you only wanted the 'List' when all of its values were greater than 5,
+ you could use 'all':
+
+ [source,groovy]
+ ----
+ g.V().has('airport','region','US-TX').values('runways').fold().all(gt(5))
+
+ // no results
+ ----
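+
+ Because 'any' and 'all' are filters, a 'List' that satisfies the predicate passes
+ through unchanged, while one that does not is dropped entirely. A minimal sketch,
+ using 'inject' to supply an illustrative literal 'List' rather than air-routes data:
+
+ [source,groovy]
+ ----
+ // every value is greater than 1, so the whole list passes
+ g.inject([2,7,3]).all(gt(1))
+
+ [2,7,3]
+
+ // no value equals 4, so the list is filtered out
+ g.inject([2,7,3]).any(eq(4))
+
+ // no results
+ ----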
+
+ If you need to push two 'List' objects together, you can use 'combine'. The
+ following simple example combines the incoming 'List' from 'fold' with a constant
+ 'List' containing the value 100:
+
+ [source,groovy]
+ ----
+ g.V().has('airport','region','US-TX').values('runways').fold().combine([100])
+
+ [2,7,3,3,4,4,2,3,2,3,5,2,2,3,3,4,3,2,1,3,2,3,4,3,2,3,100]
+ ----
+
+ A more advanced, and likely more useful, form of 'combine' supplies a traversal
+ as the argument to dynamically choose the 'List' to combine. In the following
+ example, we aggregate the runway values for all of the airports in the "US-VA" region
+ into "v", which is a 'List' with the value '[1,1,2,2,2,2,3,4]'. We then traverse
+ 'out' along the route edges to 3 adjacent airport vertices. For each of those
+ vertices, the 'map' step gets the runway value and does a 'fold' to produce a 'List'
+ with just that value in it. As a final step, it uses 'combine' to push that
+ single-value 'List' together with the one stored in the aggregate "v", which we
+ gather dynamically with 'select'.
+
+ [source,groovy]
+ ----
+ g.V().has('airport','region',within('US-VA')).
+   aggregate('v').by('runways').
+   out('route').limit(3).
+   map(values('runways').fold().combine(select('v')))
+
+ [4,1,1,2,2,2,2,3,4]
+ [4,1,1,2,2,2,2,3,4]
+ [3,1,1,2,2,2,2,3,4]
+ ----
+
+ While we won't visit this form for all the 'List' steps we look at, you will notice
+ that all of these 'List' transformation steps have both a literal constant form and
+ this more dynamic form that uses a traversal as the argument. We saw a similar
+ pattern in <<textsteps>>.
+
+ The 'conjoin' step is interesting in that it is the only 'map'-like step that
+ transforms the incoming 'List' to a different type. When you use 'conjoin' on a
+ 'List', it takes the values within it, transforms each to a string and then joins
+ them all together into a single string using the delimiter that you specify.
+ A simple use case for this feature is producing a delimited list of values,
+ which can be helpful for doing a fast export of data in a structured form.
+
+ [source,groovy]
+ ----
+ g.V().has('airport','region','US-VA').
+   valueMap('code','longest','city').by(unfold()).
+   map(select(values).conjoin('\t'))
+
+ SHD 6002 Staunton/Waynesboro/Harrisonburg
+ ORF 9001 Norfolk
+ RIC 9003 Richmond
+ IAD 11500 Washington D.C.
+ ROA 6800 Roanoke
+ CHO 6001 Charlottesville
+ PHF 8003 Newport News
+ LYH 5799 Lynchburg
+ ----
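+
+ Since 'conjoin' converts each value to a string before joining, the incoming 'List'
+ does not need to hold strings. A minimal sketch, using 'inject' to supply
+ illustrative numeric values and an arbitrarily chosen "-" delimiter:
+
+ [source,groovy]
+ ----
+ // each integer is converted to a string and joined with the delimiter
+ g.inject([1,2,3]).conjoin('-')
+
+ 1-2-3
+ ----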
+
+ Gremlin also allows you to perform basic set operations with 'difference',
+ 'disjunct', 'intersect' and 'product'. When you use these steps, it is important to
+ recognize that they each accept an incoming 'List' or 'Set', but a 'List' will be
+ treated as a 'Set'. The same can be said for the argument given to these
+ steps. The output will be a 'Set'.
+
+ [source,groovy]
+ ----
+ g.V().has('airport','region','US-VA').values('code').fold()
+
+ [SHD,ORF,RIC,IAD,ROA,CHO,PHF,LYH]
+
+ g.V().has('airport','region','US-VA').values('code').fold().
+   difference(['SHD', 'MIA'])
+
+ [ORF,ROA,LYH,CHO,RIC,IAD,PHF]
+
+ g.V().has('airport','region','US-VA').values('code').fold().
+   disjunct(['SHD', 'MIA'])
+
+ [ORF,MIA,ROA,LYH,CHO,RIC,IAD,PHF]
+
+ g.V().has('airport','region','US-VA').values('code').fold().
+   intersect(['SHD', 'MIA'])
+
+ [SHD]
+
+ g.V().has('airport','region','US-VA').values('code').fold().
+   product(['SHD', 'MIA'])
+
+ [[SHD,SHD],[SHD,MIA],[ORF,SHD],[ORF,MIA],[RIC,SHD],[RIC,MIA],[IAD,SHD],[IAD,MIA],
+ [ROA,SHD],[ROA,MIA],[CHO,SHD],[CHO,MIA],[PHF,SHD],[PHF,MIA],[LYH,SHD],[LYH,MIA]]
+ ----
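+
+ One consequence of treating a 'List' as a 'Set' is that duplicate values collapse.
+ The following sketch uses 'inject' with illustrative literal values to show this.
+ The output is omitted because the ordering of the resulting 'Set' is not something
+ to rely on:
+
+ [source,groovy]
+ ----
+ // the incoming list holds 1 twice, but the result holds it once: 1 and 2
+ g.inject([1,1,2,3]).difference([3])
+
+ // 1 is only in the incoming list and 4 is only in the argument, so both
+ // are kept; 3 appears on both sides and is excluded
+ g.inject([1,3,3]).disjunct([3,4])
+ ----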
+
+ We should not overlook the fact that these steps can also take a traversal as an
+ argument, as shown in the following example, which uses 'aggregate' to collect all
+ the airport codes in "US-VA" into a list side-effect named "x" and then does an
+ 'intersect' on that value against a list of codes for outgoing routes:
+
+ [source,groovy]
+ ----
+ g.V().has('airport','region','US-VA').aggregate('x').by('code').
+   out('route').values('code').fold().
+   intersect(select('x'))
+
+ [ORF,ROA,CHO,RIC,IAD,SHD]
+ ----
+
+ The 'reverse' step reverses the order of an incoming object. While its typical use
+ would be for a list, it could also be used on a string.
+
+ [source,groovy]
+ ----
+ g.V().has('airport','region','US-VA').values('code').fold()
+
+ [SHD,ORF,RIC,IAD,ROA,CHO,PHF,LYH]
+
+ g.V().has('airport','region','US-VA').values('code').fold().reverse()
+
+ [LYH,PHF,CHO,ROA,IAD,RIC,ORF,SHD]
+ ----
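+
+ As a sketch of the string case, the following traversal uses 'inject' to supply an
+ illustrative literal string rather than a value read from the graph:
+
+ [source,groovy]
+ ----
+ // a string is reversed character by character
+ g.inject('IAD').reverse()
+
+ DAI
+ ----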
+
+ When you need to merge two collections together you can use the 'merge' step, which
+ works for lists, sets and maps. This step is similar to 'combine', but does not
+ allow duplicates.
+
+ [source,groovy]
+ ----
+ g.V().has('airport','region','US-VA').values('code').fold().merge(['MIA','IAD'])
+
+ [ORF,MIA,ROA,LYH,CHO,RIC,IAD,SHD,PHF]
+
+ g.V().has('airport','region','US-VA').valueMap('code').merge([x: 10])
+
+ [x:10,code:[SHD]]
+ [x:10,code:[ORF]]
+ [x:10,code:[RIC]]
+ [x:10,code:[IAD]]
+ [x:10,code:[ROA]]
+ [x:10,code:[CHO]]
+ [x:10,code:[PHF]]
+ [x:10,code:[LYH]]
+ ----
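+
+ The difference between 'combine' and 'merge' shows up when both lists share a
+ value. The following sketch uses 'inject' with illustrative literal values. The
+ 'merge' output is omitted because its ordering should not be relied upon:
+
+ [source,groovy]
+ ----
+ // combine keeps the duplicate 3
+ g.inject([1,2,3]).combine([3,4])
+
+ [1,2,3,3,4]
+
+ // merge keeps a single 3, giving the values 1, 2, 3 and 4
+ g.inject([1,2,3]).merge([3,4])
+ ----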
+
+ Once again, you may also provide a traversal argument to 'merge' to dynamically
+ provide the object to merge with the incoming one.
+
+ [source,groovy]
+ ----
+ g.V().has('airport','region','US-VA').aggregate('x').by('code').
+   out('route').limit(5).values('code').fold().
+   merge(select('x'))
+
+ [ORF,DCA,DFW,PHL,ROA,CLT,LYH,CHO,IAD,RIC,SHD,PHF]
+ ----
+
+ In the prior example, we gather the airport codes in "x", then traverse 'out'
+ along route edges and fold 5 of the resulting codes into a list, which we then
+ 'merge' with "x" to produce a single output of both lists without duplicates.
+
+ The collection steps provide important utility functions for Gremlin. They allow you
+ to do helpful transformations to your data that can get your final result into the
+ form your application ultimately needs. They also make Gremlin queries more concise
+ and readable, as they often replace longer patterns of other Gremlin steps that do
+ the same job.
+
[[datesteps]]
Working with dates
~~~~~~~~~~~~~~~~~~