Commit 9877bfc

Additional layout and wording improvements to section 5
1 parent 4c807fe commit 9877bfc

1 file changed: +85 -34 lines changed

book/Gremlin-Graph-Guide.adoc

@@ -1,11 +1,11 @@
 PRACTICAL GREMLIN: An Apache TinkerPop Tutorial
 ===============================================
 Kelvin R. Lawrence <gfxman@yahoo.com>
-v278-preview, Mar 26, 2018
-// Mon Mar 26, 2018 17:00:31 CDT
+v278-preview, Mar 27, 2018
+// Tue Mar 27, 2018 08:14:10 CDT
 //:Author: Kelvin R. Lawrence
 //:Email: gfxman@yahoo.com
-//:Date: Mar 26 2018
+//:Date: Mar 27 2018
 :Numbered:
 :source-highlighter: pygments
 :pygments-style: paraiso-dark
@@ -11921,17 +11921,23 @@ interested in as follows. Only airports with 105 outgoing routes are selected.
 // Which of these airports have 105 outgoing routes?
 g.V().hasLabel('airport').
 group().by(out().count()).by('code').next().get(105L)
+----
+
+NOTE: Currently 'select' can only take a string value as the key so we have to
+use the slightly awkward 'next().get()' syntax to get a numeric key from a
+result.
 
+This time the results only include the airports with 105 outgoing routes.
+
+[source,groovy]
+----
 PHX
 MEX
 TLV
 HAM
 XIY
 ----
 
-NOTE: Currently 'select' can only take a string value as the key so we have to
-use the slightly awkward 'next().get()' syntax to get a numeric key from a
-result.
 
 [[groupvar]]
 Using groupCount with a traversal variable
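
The NOTE moved in this hunk mentions reading a numeric key with 'next().get(105L)'. The following is a minimal plain-Groovy sketch of why the 'L' suffix matters. It assumes the map returned by 'next()' is keyed by Long values, since the 'count' step produces a Long. The map literal is a hand-built stand-in that reuses the five codes shown above, not output from a graph.

```groovy
// Hand-built stand-in for the Map produced by
// group().by(out().count()).by('code').next()
// Assumption: the keys are Long because count() produces Long values.
def grouped = [(105L): ['PHX', 'MEX', 'TLV', 'HAM', 'XIY']]

// A Long key matches the entry.
def byLong = grouped.get(105L)

// An Integer 105 is not equal to a Long 105, so this lookup finds nothing.
def byInt = grouped.get(105)

println byLong
println byInt
```

Under that assumption, passing a plain '105' would return null rather than the list of airport codes.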
@@ -12139,7 +12145,10 @@ between airports anywhere in Europe and airports in the USA.
 [source,groovy]
 ----
 // How many routes from anywhere in Europe to the USA?
-g.V().has('continent','code','EU').out().out().has('country','US').count()
+
+g.V().has('continent','code','EU').
+out().out().has('country','US').
+count()
 
 351
 ----
@@ -12153,23 +12162,33 @@ US airports have flights that arrive from Europe.
 [source,groovy]
 ----
 // How many different US airports have routes from Europe?
-g.V().has('continent','code','EU').out().out().has('country','US').dedup().count()
+
+g.V().has('continent','code','EU').
+out().out().has('country','US').
+dedup().count()
 
 38
 ----
 
 So we can now see that the 345 routes from European airports arrive at one of 38
 airports in the United States. We can dig a bit deeper and look at the distribution
-of these routes across the 38 airports. John F. Kennedy airport (JFK) in New York
-appears to have the most routes from Europe with Newark (EWR) having the second most.
+of these routes across the 38 airports.
 
 [source,groovy]
 ----
 //What is the distribution of the routes amongst those US airports?
 
-g.V().has('continent','code','EU').out().out().has('country','US').
-groupCount().by('code').order(local).by(values,incr)
+g.V().has('continent','code','EU').
+out().out().has('country','US').
+groupCount().by('code').
+order(local).by(values,incr)
+----
+
+John F. Kennedy airport (JFK) in New York appears to have the most routes from Europe
+with Newark (EWR) having the second most.
 
+[source,groovy]
+----
 [PHX:1,CVG:1,RSW:1,BDL:2,SJC:2,BWI:2,AUS:2,RDU:2,MSY:2,SAN:3,SLC:3,PDX:3,PIT:3,TPA:4,SFB:4,OAK:4,DTW:5,SWF:5,MSP:5,DEN:5,FLL:6,CLT:7,DFW:7,PVD:7,SEA:8,IAH:8,MCO:10,LAS:10,ATL:14,SFO:15,PHL:17,IAD:19,ORD:21,BOS:22,LAX:23,MIA:25,EWR:33,JFK:40]
 ----
 
@@ -12179,35 +12198,58 @@ all, we can calculate how many European airports have flights to the United Stat
 [source,groovy]
 ----
 // How many European airports have service to the USA?
-g.V().has('continent','code','EU').out().as('a').out().has('country','US').
+
+g.V().has('continent','code','EU').
+out().as('a').
+out().has('country','US').
 select('a').dedup().count()
 
 53
 ----
 
 Just as we did for the airports in the US we can figure out the distribution of
-routes for the European airports. It appears that London Heathrow (LHR) offers the
-most US destinations and Frankfurt (FRA) the second most.
+routes for the European airports.
 
 [source,groovy]
 ----
-//What is the distribution of these routes amongst the European airports?
-g.V().has('continent','code','EU').out().as('a').out().has('country','US').
-select('a').groupCount().by('code').order(local).by(values,incr)
+// What is the distribution of US routes amongst
+// the European airports?
+
+g.V().has('continent','code','EU').
+out().as('a').
+out().has('country','US').
+select('a').groupCount().by('code').
+order(local).by(values,incr)
+----
+
+It appears that London Heathrow (LHR) offers the
+most US destinations and Frankfurt (FRA) the second most.
 
+[source,groovy]
+----
 [RIX:1,TER:1,BRS:1,STN:1,NCE:1,KRK:1,ORK:1,KBP:1,PDL:1,DME:1,BEG:1,AGP:1,HAM:1,OPO:2,STR:2,VCE:2,BHX:2,ORY:2,ATH:2,MXP:3,BFS:3,BGO:3,GVA:3,VKO:3,HEL:3,WAW:4,SVO:4,GLA:4,EDI:5,SNN:5,CGN:5,TXL:6,VIE:6,LIS:6,BRU:7,ARN:7,OSL:8,IST:9,BCN:10,MAD:11,LGW:11,DUS:11,CPH:11,FCO:12,ZRH:13,MAN:13,DUB:15,MUC:16,AMS:18,KEF:18,CDG:22,FRA:24,LHR:27]
 ----
 
-Lastly, we can find out what the list of routes flown is. For this example we just
-return 10 of the 345 routes. Note how the 'path' step returns all parts of the
-traversal including the continent code 'EU'.
+Lastly, we can find out what the list of routes flown is. For this example I decided
+to just return 10 of the 345 routes. Note how the 'path' step returns all parts of
+the traversal including the continent code 'EU'. We could remove that part of the
+result by adding a 'from' modulator as shown earlier in the "<<pathintro>>" section.
 
 [source,groovy]
 ----
-// What are some of these routes?
-g.V().has('continent','code','EU').out().out().
-has('country','US').path().by('code').limit(10)
+// Selected routes from Europe to the USA.
+
+g.V().has('continent','code','EU').
+out().out().
+has('country','US').
+path().by('code').
+limit(10)
+----
 
+The first 10 results returned feature routes from Warsaw, Belgrade and Istanbul.
+
+[source,groovy]
+----
 [EU,WAW,JFK]
 [EU,WAW,LAX]
 [EU,WAW,ORD]
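
The Europe-to-USA queries in the preceding hunks combine three ideas: removing duplicates before counting, counting occurrences per airport code, and sorting the resulting map by its values. As a rough plain-Groovy sketch of the same logic on an in-memory list (not a Gremlin traversal), assuming a small hand-written set of route pairs. The WAW rows echo the paths shown above, while the LHR and FRA rows are invented for illustration.

```groovy
// Illustrative [origin, destination] pairs. The WAW rows echo the paths
// shown in the diff; the LHR and FRA rows are made up for this example.
def routes = [['WAW', 'JFK'], ['WAW', 'LAX'], ['WAW', 'ORD'],
              ['LHR', 'JFK'], ['FRA', 'JFK']]

// Analogue of dedup().count() on the destinations: distinct arrival airports.
def distinctDestinations = routes.collect { it[1] }.toSet().size()

// Analogue of select('a').dedup().count(): distinct origin airports.
def distinctOrigins = routes.collect { it[0] }.toSet().size()

// Analogue of groupCount().by('code') followed by order(local).by(values,incr):
// count routes per destination, then sort the entries by ascending count.
def perDestination = routes.countBy { it[1] }.sort { it.value }

println distinctDestinations
println distinctOrigins
println perDestination
```

Because the sort is ascending, the busiest destination ends up last, which matches how JFK and LHR appear at the end of the result maps above.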
@@ -12228,25 +12270,34 @@ Earlier in the book we saw examples of 'sum' being used to count a
 collection of values. You can also use 'fold' to do something similar but in a
 more 'map-reduce' type of fashion.
 
-First of all, here is a query that uses 'fold' in a way that we have already
-seen. We find all routes from Austin and use 'fold' to return a nice list of
-those names.
+First of all, here is a query that uses 'fold' in a way that we have already seen. It
+will find all routes from Austin and uses a 'fold' step to return a list of those
+names.
 
 [source,groovy]
 ----
-g.V().has('code','AUS').out('route').values('city').fold()
+g.V().has('code','AUS').
+out('route').
+values('city').fold()
+----
+
+As expected the results show all of the cities that you can fly to from Austin
+collected into a single list.
 
+[source,groovy]
+----
 [Toronto,London,Frankfurt,Mexico City,Pittsburgh,Portland,Charlotte,Cancun,Memphis,Cincinnati,Indianapolis,Kansas City,Dallas,St Louis,Albuquerque,Chicago,Lubbock,Harlingen,Guadalajara,Pensacola,Valparaiso,Orlando,Branson,St Petersburg-Clearwater,Atlanta,Nashville,Boston,Baltimore,Washington D.C.,Dallas,Fort Lauderdale,Washington D.C.,Houston,New York,Los Angeles,Orlando,Miami,Minneapolis,Chicago,Phoenix,Raleigh,Seattle,San Francisco,San Jose,Tampa,San Diego,Long Beach,Santa Ana,Salt Lake City,Las Vegas,Denver,New Orleans,Newark,Houston,El Paso,Cleveland,Oakland,Philadelphia,Detroit]
 ----
 
 However, what if we wanted to reduce our results further? Take a look at the modified
 version of our query below. It finds all routes from Austin and looks at the names of
-the destination cities. However, rather than return all the names, 'fold' is used to
-reduce the names to a value. That value being the total number of characters in all
-of those city names. We have seen 'fold' used elsewhere in the book but this time
-we provide 'fold' with a parameter and a closure. The parameter is passed to the
-closure as the first variable and the name of the city as the second. The closure
-then adds the zero and the length of each name effectively to a running total.
+the destination cities. However, rather than return all the names, this time the
+'fold' step is used differently and effectively reduces the city names to a single
+value. That value being the total number of characters in all of those city names. We
+have seen 'fold' used elsewhere in the book but this time we provide 'fold' with a
+parameter and a closure. The parameter is passed to the closure as the first variable
+and the name of the city as the second. The closure then adds the zero and the length
+of each name effectively producing a running total.
 
 [source,groovy]
 ----
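
The diff view ends before the query that this last paragraph describes. To illustrate the seed-plus-closure reduction it explains, here is a plain-Groovy analogy using the collection method 'inject'. This is not the book's Gremlin query, and the three city names are simply the first three from the 'fold' output shown above.

```groovy
// Three of the city names from the fold() output shown in the diff.
def cities = ['Toronto', 'London', 'Frankfurt']

// inject(seed) { acc, item -> ... } is a plain-Groovy analogue of a seeded
// fold: the seed (0) arrives as the first closure argument on the first call,
// each city name arrives as the second, and the closure's return value
// becomes the running total for the next call.
def totalChars = cities.inject(0) { total, name -> total + name.length() }

println totalChars   // 7 + 6 + 9 = 22
```

The seed plays the role of the parameter mentioned in the text, and the closure adds each name's length to the running total.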
