Commit b109241

Added a section for Working with Collections
1 parent 4ae4c24 commit b109241

1 file changed

+252
-0
lines changed

book/Section-Writing-Gremlin-Queries.adoc

+252
@@ -3084,6 +3084,258 @@ g.V().has('airport','code',within('IAD','MIA','LAX')).
30843084
[m]
30853085
----
30863086

3087+
[[collectionsteps]]
3088+
Working with collections
3089+
~~~~~~~~~~~~~~~~~~~~~~~~
3090+
3091+
Collections are containers for other values and refer to things like 'List', 'Set'
3092+
and 'Map'. Gremlin offers a variety of steps that help construct and manipulate these
3093+
objects. We've already seen how we can use a step like 'fold' to create a 'List' of
3094+
the objects in the traversal stream:
3095+
3096+
[source, groovy]
3097+
----
3098+
g.V().has('region','GB-ENG').values('runways').dedup().fold()
3099+
3100+
[2,1,3,4]
3101+
----
3102+
3103+
As for maps, Gremlin can produce them whenever you use a step like 'group' or
3104+
'valueMap':
3105+
3106+
[source,groovy]
3107+
----
3108+
g.V().has('code','AUS').valueMap(true,'region')
3109+
3110+
[id:3,region:[US-TX],label:airport]
3111+
----
3112+
3113+
As the result in the example above shows, collections can contain other
3114+
collections: the value of the region key, "US-TX", is itself held in a 'List'. Since
3115+
collections are core to the Gremlin language the following steps tend to be quite
3116+
helpful in many situations:
3117+
3118+
[cols="^1,4"]
3119+
|==============================================================================
3120+
|any | Determines if any object in a 'List' matches a specified predicate
3121+
|all | Determines if all objects in a 'List' match a specified predicate
3122+
|combine | Combines an incoming 'List' with the list provided as an argument
3123+
|conjoin | Takes objects in the incoming 'List' and converts them to a string joined by the specified delimiter
3124+
|difference | Calculates the difference between the incoming 'List' and the one provided as an argument
3125+
|disjunct | Calculates the disjunct set between the incoming 'List' and the provided 'List' argument
3126+
|intersect | Calculates the intersection between the incoming 'List' and the provided 'List' argument
3127+
|merge | Merges the incoming collection with the one provided as an argument
3128+
|product | Calculates the cartesian product of the incoming 'List' and the provided 'List'
3129+
|reverse | Reverses the order of the incoming 'List'
3130+
|==============================================================================
3131+
3132+
TIP: The steps noted in the prior table are meant specifically for working with
3133+
various collection types, but you should also note that many Gremlin steps
3134+
inherently work with collections. For example, the 'local' versions of 'range',
3135+
'limit', and 'tail' are good at taking parts of a collection, and 'unfold' can be
3136+
used to deconstruct collections. These are some obvious examples, but as you learn
3137+
more about Gremlin you will find many more.
3138+
3139+
Let's take a deeper look at these steps and how you might use them. We will start
3140+
with the two filtering steps of 'any' and 'all'. Oftentimes you will find yourself in
3141+
a situation where you have written some Gremlin that collects your results into a
3142+
'List' like the following example where we have the number of runways for all
3143+
airports in the "US-TX" region collected:
3144+
3145+
[source,groovy]
3146+
----
3147+
g.V().has('airport','region','US-TX').values('runways').fold()
3148+
3149+
[2,7,3,3,4,4,2,3,2,3,5,2,2,3,3,4,3,2,1,3,2,3,4,3,2,3]
3150+
----
3151+
3152+
If you were only interested in this result if the 'List' had a value of "4" in it,
3153+
you could use 'any' and the 'P.eq' predicate:
3154+
3155+
[source,groovy]
3156+
----
3157+
g.V().has('airport','region','US-TX').values('runways').fold().any(eq(4))
3158+
3159+
[2,7,3,3,4,4,2,3,2,3,5,2,2,3,3,4,3,2,1,3,2,3,4,3,2,3]
3160+
----
3161+
3162+
Similarly, if you only wanted the 'List' if all the values were greater than 5, then
3163+
you could use 'all':
3164+
3165+
[source,groovy]
3166+
----
3167+
g.V().has('airport','region','US-TX').values('runways').fold().all(gt(5))
3168+
3169+
// no results
3170+
----
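
To make the filtering behavior concrete outside of a graph, here is a minimal Python sketch of how 'any' and 'all' treat an incoming 'List', based on the description above: the whole list passes through unchanged when the predicate test succeeds, and nothing is emitted otherwise. The helper names `any_step` and `all_step` are hypothetical and exist only for illustration; they are not part of any Gremlin client. The sample data is the "US-TX" runway list shown earlier.

```python
from typing import Callable, List, Optional

def any_step(items: List[int], pred: Callable[[int], bool]) -> Optional[List[int]]:
    """Model of 'any': keep the list if at least one item satisfies pred."""
    return items if any(pred(x) for x in items) else None

def all_step(items: List[int], pred: Callable[[int], bool]) -> Optional[List[int]]:
    """Model of 'all': keep the list only if every item satisfies pred."""
    return items if all(pred(x) for x in items) else None

# Runway counts for the "US-TX" airports, copied from the listing above.
runways = [2, 7, 3, 3, 4, 4, 2, 3, 2, 3, 5, 2, 2, 3, 3, 4, 3, 2, 1, 3, 2, 3, 4, 3, 2, 3]

kept = any_step(runways, lambda x: x == 4)     # list contains a 4, so it is kept
dropped = all_step(runways, lambda x: x > 5)   # not every value is > 5, so nothing is kept

print(kept == runways)  # True
print(dropped)          # None, which stands in for "no results"
```

In this model `None` plays the role of a filtered-out traverser. Note that the same list would survive an 'any' test for values greater than 5, because it contains a 7.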
3171+
3172+
If you need to push two 'List' objects together, you can use 'combine'. You can see
3173+
a simple example of this step demonstrated as follows where we combine the incoming
3174+
'List' from 'fold' with a constant 'List' containing the value of 100:
3175+
3176+
[source,groovy]
3177+
----
3178+
g.V().has('airport','region','US-TX').values('runways').fold().combine([100])
3179+
3180+
[2,7,3,3,4,4,2,3,2,3,5,2,2,3,3,4,3,2,1,3,2,3,4,3,2,3,100]
3181+
----
3182+
3183+
A more advanced, and likely more useful, form of 'combine' is to supply a traversal
3184+
as the argument to dynamically choose the 'List' to combine. In the following
3185+
example, we aggregate the runway values for all of the airports in the "US-VA" region
3186+
into "v" which is a 'List' and has the value '[1,1,2,2,2,2,3,4]'. We then traverse
3187+
'out' along the route edges to 3 adjacent airport vertices. For each of those
3188+
vertices, the 'map' step gets the runway value and does a 'fold' to produce a 'List'
3189+
with just that value in it. As a final step, it uses 'combine' to push that single
3190+
value 'List' together with the one stored in the aggregate "v" which we gather
3191+
dynamically with 'select'.
3192+
3193+
[source,groovy]
3194+
----
3195+
g.V().has('airport','region',within('US-VA')).
3196+
aggregate('v').by('runways').
3197+
out('route').limit(3).
3198+
map(values('runways').fold().combine(select('v')))
3199+
3200+
[4,1,1,2,2,2,2,3,4]
3201+
[4,1,1,2,2,2,2,3,4]
3202+
[3,1,1,2,2,2,2,3,4]
3203+
----
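
The list handling in that query can be modeled with plain Python lists. This is only a sketch of the semantics described above, where 'combine' appends the argument list to the incoming one and keeps duplicates. `combine_step` is a hypothetical helper name, and the runway values 4, 4 and 3 are taken from the first element of each output row.

```python
def combine_step(incoming, other):
    """Model of 'combine': concatenate two lists, keeping duplicates."""
    return list(incoming) + list(other)

# The aggregated side-effect "v" as stated in the text.
v = [1, 1, 2, 2, 2, 2, 3, 4]

# Each adjacent airport contributes a single-value list (the result of 'fold').
results = [combine_step([runways], v) for runways in (4, 4, 3)]

for row in results:
    print(row)
```

The literal form is the same operation with a constant argument, for example `combine_step([2, 7], [100])`.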
3204+
3205+
While we won't visit this form for all the 'List' steps we look at, you will notice
3206+
that all of these sorts of 'List' transformation steps have both a literal constant
3207+
form and this more dynamic form that uses a traversal as the argument. We saw a
3208+
similar pattern in <<textsteps>>.
3209+
3210+
The 'conjoin' step is interesting in that it is the only 'map'-like step that
3211+
transforms the incoming 'List' to a different type. When you use 'conjoin' on a
3212+
'List', it will take the values within it, transform each to a string and then join
3213+
them all together into a single string using the value that you specify as a delimiter.
3214+
A simple use case for this feature might be to produce a delimited list of values,
3215+
as that could be helpful for doing a fast export of data in a structured form.
3216+
3217+
[source,groovy]
3218+
----
3219+
g.V().has('airport','region','US-VA').
3220+
valueMap('code','longest','city').by(unfold()).
3221+
map(select(values).conjoin('\t'))
3222+
3223+
SHD 6002 Staunton/Waynesboro/Harrisonburg
3224+
ORF 9001 Norfolk
3225+
RIC 9003 Richmond
3226+
IAD 11500 Washington D.C.
3227+
ROA 6800 Roanoke
3228+
CHO 6001 Charlottesville
3229+
PHF 8003 Newport News
3230+
LYH 5799 Lynchburg
3231+
----
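
As a rough model of what 'conjoin' does to one of those rows, the following Python sketch converts each value to a string and joins the results with the delimiter. `conjoin_step` is a hypothetical helper, and the row values are copied from the first line of the output above.

```python
def conjoin_step(items, delimiter):
    """Model of 'conjoin': stringify each item, then join with the delimiter."""
    return delimiter.join(str(x) for x in items)

# Values for one airport: code, longest runway, city.
row = ["SHD", 6002, "Staunton/Waynesboro/Harrisonburg"]

line = conjoin_step(row, "\t")
print(line)
```

The result is a single string rather than a 'List', which is the type change the text refers to.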
3232+
3233+
Gremlin also allows you to perform basic operations on sets with 'difference',
3234+
'disjunct', 'intersect' and 'product'. When you use these steps, it is important to
3235+
recognize that they each accept an incoming 'List' or 'Set', but if it is a 'List',
3236+
it will be treated as a 'Set'. The same can be said for the argument given to these
3237+
steps. The output will be a 'Set'.
3238+
3239+
[source,groovy]
3240+
----
3241+
g.V().has('airport','region','US-VA').values('code').fold()
3242+
3243+
[SHD,ORF,RIC,IAD,ROA,CHO,PHF,LYH]
3244+
3245+
g.V().has('airport','region','US-VA').values('code').fold().
3246+
difference(['SHD', 'MIA'])
3247+
3248+
[ORF,ROA,LYH,CHO,RIC,IAD,PHF]
3249+
3250+
g.V().has('airport','region','US-VA').values('code').fold().
3251+
disjunct(['SHD', 'MIA'])
3252+
3253+
[ORF,MIA,ROA,LYH,CHO,RIC,IAD,PHF]
3254+
3255+
g.V().has('airport','region','US-VA').values('code').fold().
3256+
intersect(['SHD', 'MIA'])
3257+
3258+
[SHD]
3259+
3260+
g.V().has('airport','region','US-VA').values('code').fold().
3261+
product(['SHD', 'MIA'])
3262+
3263+
[[SHD,SHD],[SHD,MIA],[ORF,SHD],[ORF,MIA],[RIC,SHD],[RIC,MIA],[IAD,SHD],[IAD,MIA],
3264+
[ROA,SHD],[ROA,MIA],[CHO,SHD],[CHO,MIA],[PHF,SHD],[PHF,MIA],[LYH,SHD],[LYH,MIA]]
3265+
----
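
For readers who think in terms of ordinary set algebra, the four results above correspond to the following Python operations. This is a sketch under the assumption that 'disjunct' means symmetric difference, which is what the output above suggests. Python sets are unordered, so `sorted` is used only to get a stable display; the ordering in Gremlin output may differ.

```python
from itertools import product

# The "US-VA" airport codes and the literal argument used in the listing above.
va = ["SHD", "ORF", "RIC", "IAD", "ROA", "CHO", "PHF", "LYH"]
arg = ["SHD", "MIA"]

difference = set(va) - set(arg)   # in va but not in arg
disjunct = set(va) ^ set(arg)     # in exactly one of the two
intersect = set(va) & set(arg)    # in both
pairs = [list(p) for p in product(va, arg)]  # every (va, arg) pairing

print(sorted(difference))  # ['CHO', 'IAD', 'LYH', 'ORF', 'PHF', 'RIC', 'ROA']
print(sorted(disjunct))    # ['CHO', 'IAD', 'LYH', 'MIA', 'ORF', 'PHF', 'RIC', 'ROA']
print(sorted(intersect))   # ['SHD']
print(len(pairs))          # 16
```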
3266+
3267+
We should not overlook the fact that these steps can also take a traversal as an
3268+
argument as shown in the following example that uses 'aggregate' to collect all the
3269+
airport codes in "US-VA" into a list side-effect of "x" and then does an 'intersect'
3270+
on that value against a list of codes for outgoing routes:
3271+
3272+
[source,groovy]
3273+
----
3274+
g.V().has('airport','region','US-VA').aggregate('x').by('code').
3275+
out('route').values('code').fold().
3276+
intersect(select('x'))
3277+
3278+
[ORF,ROA,CHO,RIC,IAD,SHD]
3279+
----
3280+
3281+
The 'reverse' step reverses the order of an incoming object. While its typical use
3282+
would be for a list, it could also be used on a string.
3283+
3284+
[source,groovy]
3285+
----
3286+
g.V().has('airport','region','US-VA').values('code').fold()
3287+
3288+
[SHD,ORF,RIC,IAD,ROA,CHO,PHF,LYH]
3289+
3290+
g.V().has('airport','region','US-VA').values('code').fold().reverse()
3291+
3292+
[LYH,PHF,CHO,ROA,IAD,RIC,ORF,SHD]
3293+
----
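
The equivalent operation on plain Python values is a reversed slice, which works on both lists and strings. This small sketch only mirrors the behavior described above and uses the same "US-VA" codes.

```python
codes = ["SHD", "ORF", "RIC", "IAD", "ROA", "CHO", "PHF", "LYH"]

reversed_codes = codes[::-1]   # reverse the order of the list elements
reversed_text = "SHD"[::-1]    # reverse the characters of a string

print(reversed_codes)  # ['LYH', 'PHF', 'CHO', 'ROA', 'IAD', 'RIC', 'ORF', 'SHD']
print(reversed_text)   # DHS
```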
3294+
3295+
When you need to merge two collections together you can use the 'merge' step, which works
3296+
for lists, sets and maps. This step is similar to 'combine', but does not allow
3297+
duplicates.
3298+
3299+
[source,groovy]
3300+
----
3301+
g.V().has('airport','region','US-VA').values('code').fold().merge(['MIA','IAD'])
3302+
3303+
[ORF,MIA,ROA,LYH,CHO,RIC,IAD,SHD,PHF]
3304+
3305+
g.V().has('airport','region','US-VA').valueMap('code').merge([x: 10])
3306+
3307+
[x:10,code:[SHD]]
3308+
[x:10,code:[ORF]]
3309+
[x:10,code:[RIC]]
3310+
[x:10,code:[IAD]]
3311+
[x:10,code:[ROA]]
3312+
[x:10,code:[CHO]]
3313+
[x:10,code:[PHF]]
3314+
[x:10,code:[LYH]]
3315+
----
3316+
3317+
Once again, you may also provide a traversal argument to 'merge' to dynamically
3318+
provide the object to merge with the incoming one.
3319+
3320+
[source,groovy]
3321+
----
3322+
g.V().has('airport','region','US-VA').aggregate('x').by('code').
3323+
out('route').limit(5).values('code').fold().
3324+
merge(select('x'))
3325+
3326+
[ORF,DCA,DFW,PHL,ROA,CLT,LYH,CHO,IAD,RIC,SHD,PHF]
3327+
----
3328+
3329+
In the prior example, we gather the airport codes in "x" and then traverse 'out'
3330+
along route edges and fold 5 of them into a list, which we then 'merge' with "x", providing
3331+
a single list output of both lists without duplicates.
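
The difference between 'merge' and 'combine' can be seen in a short Python model. This is a sketch only: `merge_step` is a hypothetical helper, the order-preserving de-duplication is a choice made for readable output rather than something Gremlin guarantees, and letting the argument win on a key collision between two maps is an assumption that the text does not address.

```python
def merge_step(incoming, other):
    """Model of 'merge': union for lists (no duplicates), key union for maps."""
    if isinstance(incoming, dict) and isinstance(other, dict):
        # Assumption: on a key collision the argument's value is kept.
        return {**incoming, **other}
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(list(incoming) + list(other)))

va = ["SHD", "ORF", "RIC", "IAD", "ROA", "CHO", "PHF", "LYH"]

merged_list = merge_step(va, ["MIA", "IAD"])        # IAD is not repeated
merged_map = merge_step({"code": ["SHD"]}, {"x": 10})

print(merged_list)
print(merged_map)
```

With 'combine' the same list arguments would produce ten entries, because "IAD" would appear twice.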
3332+
3333+
The collection steps provide important utility functions for Gremlin. They allow you
3334+
to do helpful transformations to your data that can get your final result into the
3335+
form your application ultimately needs. They also make Gremlin queries more concise
3336+
and readable, because a single step can often replace a longer pattern of other
3337+
Gremlin steps that does the same job.
3338+
30873339
[[datesteps]]
30883340
Working with dates
30893341
~~~~~~~~~~~~~~~~~~
