@@ -936,11 +936,241 @@ by 1 for each call to 'addV' creating an auto-incremeting identifier. This techn
936
936
for creating vertices is often helpful when there is a need to generate some vertices
937
937
for testing.
938
938
939
+ [[upsert]]
940
+ Using 'mergeV' and 'mergeE' for upserting
941
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
942
+
943
+ Checking to see if a vertex or edge already exists and then either inserting that
944
+ element if it does not or updating the element that is found, is a common database
945
+ access pattern often referred to as an 'upsert' . There are a variety of ways to
946
+ implement an upsert in Gremlin, but the 'mergeV' and 'mergeE' steps are designed
947
+ specifically for this purpose and offer a wide degree of flexibility in this task.
948
+
949
+ Let's assume we wanted to add a new airport, with the code '"XYZ"' but we are not
950
+ sure whether the airport has already been added. We can handle both cases
951
+ using the most basic form of 'mergeV' :
952
+
953
+ [source,groovy]
954
+ ----
955
+ g.mergeV([code:'XYZ'])
956
+
957
+ v[53865]
958
+ ----
959
+
960
+ As you can see, 'mergeV' takes a 'Map' as an argument where the keys and values
961
+ become the search criteria for the vertex. In the above case, it is similar to
962
+ writing 'has("code","XYZ")' to find a vertex. If 'mergeV' finds a vertex with that
963
+ code and value, it returns it. However, 'mergeV' has a dual purpose. Should the vertex
964
+ not be found, then 'mergeV' will automatically create a new vertex with a "code" of
965
+ "XYZ" and return that. Evaluating that same query again will result in returning the
966
+ existing 'v[53865]' .
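+
+ As a quick check, and assuming no '"XYZ"' vertex existed before the first call, a
+ sketch like the one below should show that repeating the step does not create a
+ duplicate. The 'iterate' call simply runs the traversal without printing a result,
+ and the count shown is what we would expect under that assumption.
+
+ [source,groovy]
+ ----
+ // Run the same upsert again; the existing vertex should be matched, not re-created
+ g.mergeV([code:'XYZ']).iterate()
+
+ g.V().has('code','XYZ').count()
+
+ 1
+ ----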
967
+
968
+ You may match on as many properties as necessary including 'T.id' and 'T.label' :
969
+
970
+ [source,groovy]
971
+ ----
972
+ g.mergeV([(T.label): 'airport', (T.id): 999999, code:'XYZ'])
973
+
974
+ v[999999]
975
+ ----
976
+
977
+ Note that the match must be complete in that all keys designated in the 'Map' must
978
+ match or else a new vertex will be created. In the example above, there is a vertex
979
+ present that has a code of '"XYZ"' , but the one we created initially lacks a
980
+ 'T.label' of "airport" and has a different 'T.id' altogether.
981
+
982
+ We do not always want to use the same criteria for searching as we do for creating.
983
+ Most commonly we tend to search by the element identifier, which is the fastest way
984
+ to check for existence. For these cases you will only want to provide that identifier
985
+ to the 'Map' given to 'mergeV' and separately specify the properties you want to use
986
+ to create the vertex if it is not found. You can do this with the 'option' modulator
987
+ and the 'Merge' enum.
988
+
989
+ [source,groovy]
990
+ ----
991
+ g.mergeV([(T.id): 999999]).
992
+ option(Merge.onCreate, [(T.label): 'airport', code:'XYZ'])
993
+
994
+ v[999999]
995
+ ----
996
+
997
+ In this prior example, 'mergeV' will match on the vertex identifier of '999999' and
998
+ will return it if found. Otherwise, it will defer to the 'onCreate' action which
999
+ specifies a 'Map' of keys and values to use to create the new vertex. The 'onCreate'
1000
+ 'Map' inherits keys and values from the search map and would therefore include the
1001
+ '999999' in the creation of the vertex.
1002
+
1003
+ Just as we have an 'onCreate' to deal with the "does not exist" path for the search, we
1004
+ also have an 'onMatch' to allow more control over the "does exist" path to update an
1005
+ existing vertex. For example, if we wanted to update the code
1006
+ property to a lower-case '"xyz"' in the event the vertex exists, we could write the
1007
+ following:
1008
+
1009
+ [source,groovy]
1010
+ ----
1011
+ g.mergeV([(T.id): 999999]).
1012
+ option(Merge.onCreate, [(T.label): 'airport', code:'XYZ']).
1013
+ option(Merge.onMatch, [code:'xyz'])
1014
+
1015
+ v[999999]
1016
+ ----
1017
+
1018
+ At this point we've demonstrated the general form of 'mergeV' where you give it a
1019
+ 'Map' of key/value pairs to use for a search and then use 'option' modulators to
1020
+ provide specific data to use to either create or update based on the initial match.
1021
+ For more advanced use cases, we have some additional flexibility in how we supply the
1022
+ 'Map' arguments as they may all be supplied by way of a 'Traversal' :
1023
+
1024
+ [source,groovy]
1025
+ ----
1026
+ g.withSideEffect('search', [(T.id): 999998]).
1027
+ withSideEffect('create', [code: 'ZYX', (T.label): 'airport']).
1028
+ withSideEffect('match', [code: 'zyx']).
1029
+ mergeV(select('search')).
1030
+ option(Merge.onCreate, select('create')).
1031
+ option(Merge.onMatch, select('match'))
1032
+
1033
+ v[999998]
1034
+ ----
1035
+
1036
+ The prior example shows how you can use a step like 'select' to dynamically provide
1037
+ a 'Map' argument to the various 'mergeV' parameters.
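+
+ A related form, sketched here with some hedging, is to let the search 'Map' arrive
+ as the incoming traverser and call 'mergeV' with no arguments. The identifiers and
+ the codes '"AAA"' and '"BBB"' below are made up for illustration, and the example
+ assumes a graph that accepts user supplied identifiers.
+
+ [source,groovy]
+ ----
+ // Each Map in the injected List becomes the search (and create) criteria
+ // for one upsert
+ g.inject([[(T.id): 999996, (T.label): 'airport', code: 'AAA'],
+           [(T.id): 999997, (T.label): 'airport', code: 'BBB']]).
+   unfold().
+   mergeV()
+ ----
+
+ Because each 'Map' is used both to search and to create, this form suits cases
+ where the full set of keys should be matched, as in the second example of this
+ section.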
1038
+
1039
+ The counterpart to 'mergeV' is 'mergeE' where the same patterns can be used to upsert
1040
+ edges. Edges have a bit more complexity in that, in addition to the property values they
1041
+ may have, they also have an 'OUT' and an 'IN' (or 'from' and 'to' ) vertex that
1042
+ applies to them. Let's assume we want to look for an existing route edge between the
1043
+ '"XYZ"' airport that we just added and '"DFW"' . If we find that edge with that search
1044
+ criteria we simply return it, but if it is not found then we will create it with a
1045
+ dist property of 0. To do this we need the vertex ids that this edge relates to. We
1046
+ can then use those ids to reference them in the search criteria:
1047
+
1048
+ [source,groovy]
1049
+ ----
1050
+ g.V().has('code','DFW').id()
1051
+
1052
+ 8
1053
+
1054
+ g.V().has('code','XYZ').id()
1055
+
1056
+ 999999
1057
+
1058
+ g.mergeE([(T.label): 'route', (Direction.from): 8, (Direction.to): 999999]).
1059
+ option(Merge.onCreate, [dist: 0])
1060
+
1061
+ e[41224][8-route->999999]
1062
+
1063
+ g.E(41224L).elementMap()
1064
+
1065
+ [id:41224,label:route,IN:[id:999999,label:airport],OUT:[id:8,label:airport],dist:0]
1066
+ ----
1067
+
1068
+ It is not possible to create new vertices from 'mergeE' automatically. They must
1069
+ already exist for the edge to be created. In addition to 'onMatch' and 'onCreate' ,
1070
+ 'mergeE' also allows for two other 'option' modulators: 'outV' and 'inV' , which let
1071
+ you specify search criteria as a 'Map' for the 'from' and the 'to' of the edge. Using
1072
+ that capability we could rewrite the previous example as follows:
1073
+
1074
+ [source,groovy]
1075
+ ----
1076
+ g.mergeE([(T.label): 'route', (Direction.from): Merge.outV, (Direction.to): Merge.inV]).
1077
+ option(Merge.outV, [code: 'DFW']).
1078
+ option(Merge.inV, [code: 'XYZ']).
1079
+ option(Merge.onCreate, [dist: 0])
1080
+
1081
+
1082
+ e[41224][8-route->999999]
1083
+ ----
1084
+
1085
+ This section introduced the basics of the 'mergeV' and 'mergeE' steps, which provide
1086
+ a powerful way to encapsulate upsert logic for vertices and edges. As you
1087
+ continue through the following sections you will find more advanced features with
1088
+ these steps.
1089
+
1090
+ Incorporating 'fail' with 'mergeV' or 'mergeE'
1091
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1092
+
1093
+ We learned about the 'fail' step in <<coalfail>> where it will throw an exception when it
1094
+ is encountered. You can incorporate this step into 'mergeV' and 'mergeE' to stop a
1095
+ traversal in cases where you don't want the traversal to continue in 'onCreate' or
1096
+ 'onMatch' .
1097
+
1098
+ Let's envision a case where you do not expect the '"XYZ"' airport to be present in
1099
+ the graph and that if it is you do not want the query to proceed any further in its
1100
+ processing. Since the 'option' modulator takes a traversal as an argument, you can
1101
+ give it a 'fail' step for its 'onMatch' .
1102
+
1103
+ [source,groovy]
1104
+ ----
1105
+ g.mergeV([code: 'XYZ']).
1106
+ option(Merge.onCreate, [(T.label): 'airport', code:'XYZ']).
1107
+ option(Merge.onMatch, fail('XYZ airport already exists'))
1108
+
1109
+ fail() Step Triggered
1110
+ =========================================================================================================================================================================
1111
+ Message > XYZ airport already exists
1112
+ Traverser> v[999999]
1113
+ Bulk > 1
1114
+ Traversal> fail("XYZ airport already exists")
1115
+ Parent > MergeVertexStep [mergeV(["code":"XYZ"]).option(Merge.onCreate,[(T.label):"airport","code":"XYZ"]).option(Merge.onMatch,__.fail("XYZ airport already exists"))]
1116
+ Metadata > {}
1117
+ =========================================================================================================================================================================
1118
+ ----
1119
+
1120
+ Note that the output above is the string representation of a 'FailException' as
1121
+ printed in the Gremlin Console.
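+
+ The same idea works in the other direction. As a sketch, assuming you only ever want
+ to update an existing '"XYZ"' airport and never create one, the 'fail' step could be
+ given to 'onCreate' instead. The message text and the lower-case code used for the
+ update are just illustrations.
+
+ [source,groovy]
+ ----
+ // Update the vertex when it is found, but refuse to create it when it is missing
+ g.mergeV([code: 'XYZ']).
+   option(Merge.onMatch, [code: 'xyz']).
+   option(Merge.onCreate, fail('XYZ airport does not exist'))
+ ----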
1122
+
1123
+ Specifying cardinality with 'mergeV'
1124
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1125
+
1126
+ Most of the time, vertex properties tend to be modelled with 'single' cardinality.
1127
+ However, you may have situations where other cardinalities are used or you are using a
1128
+ graph that defaults to a cardinality other than 'single' . The 'mergeV' step provides
1129
+ two ways to explicitly set the cardinality for the properties given to it and both
1130
+ make use of the 'Cardinality' enum. The first way to do this is to explicitly set the
1131
+ cardinality per property value. For example, let's imagine that you want to use 'set'
1132
+ cardinality for the code property.
1133
+
1134
+ [source,groovy]
1135
+ ----
1136
+ g.mergeV([(T.id): 999999]).
1137
+ option(Merge.onCreate, [(T.label): 'airport', code: set('XYZ')])
1138
+
1139
+ v[999999]
1140
+ ----
1141
+
1142
+ By wrapping the '"XYZ"' in the 'set()' , which is statically imported from
1143
+ 'Cardinality.set' , you mark that value in a way that 'mergeV' knows to tell the graph
1144
+ to use that specific cardinality for that property. You could also set this value
1145
+ globally for the step by providing it as an argument after the 'Map' :
1146
+
1147
+ [source,groovy]
1148
+ ----
1149
+ g.mergeV([(T.id): 999999]).
1150
+ option(Merge.onCreate, [(T.label): 'airport', code: 'XYZ'], Cardinality.set)
1151
+
1152
+ v[999999]
1153
+ ----
1154
+
1155
+ When you offer 'set' this way, 'mergeV' will assume all property keys use this
1156
+ cardinality. If an explicit cardinality is specified for a property, it will override the
1157
+ global setting.
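+
+ To make the override concrete, here is a minimal sketch that combines both forms.
+ It assumes 'Cardinality.single' can be used to wrap a value in the same way as the
+ 'set()' helper shown earlier, and the city value is made up for illustration.
+
+ [source,groovy]
+ ----
+ // 'city' picks up the step-wide 'set' cardinality, while 'code' is
+ // explicitly marked as 'single' and so overrides it
+ g.mergeV([(T.id): 999999]).
+   option(Merge.onCreate,
+          [(T.label): 'airport', code: Cardinality.single('XYZ'), city: 'Nowhere'],
+          Cardinality.set)
+ ----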
1158
+
939
1159
[[coaladdv]]
940
- Using 'coalesce' to only add a vertex if it does not exist
941
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1160
+ [[upsert]]
1161
+ When you need more flexibility than 'mergeV' and 'mergeE'
1162
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1163
+
1164
+ The 'mergeV' and 'mergeE' steps offer a wide breadth of options to encapsulate
1165
+ upsert-like logic with varying mechanisms for matching, creating and updating. While
1166
+ these steps are extremely flexible, you may yet find a scenario where they do not do
1167
+ everything you require. In these cases, you may fall back to lower-level Gremlin
1168
+ steps that can offer more options, but may be a bit more complex to implement and
1169
+ possibly lack the same performance capability as different graph providers may not be
1170
+ able to optimize these queries as well as the more directly purposed 'mergeV' and
1171
+ 'mergeE' steps.
942
1172
943
- In the " <<coalconst>>" section we looked at how coalesce could be used to return a
1173
+ In the <<coalconst>> section we looked at how coalesce could be used to return a
944
1174
constant value if the other entities that we were looking for did not exist. We can
945
1175
reuse that pattern to produce a traversal that will only add a vertex to the graph if
946
1176
that vertex has not already been created.
@@ -995,10 +1225,6 @@ g.V().has('code','XYZ').fold().coalesce(unfold(),addV().property('code','XYZ'))
995
1225
v[53865]
996
1226
----
997
1227
998
- [[upsert]]
999
- Using 'coalesce' to derive an 'upsert' pattern
1000
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1001
-
1002
1228
Using 'coalesce' in this way provides us with a nice pattern for a commonly performed
1003
1229
task of checking to see if something already exists before we try to update it and
1004
1230
otherwise create it. This is often called an '"upsert"' pattern as the operation
@@ -3077,8 +3303,8 @@ g.V().has('code','DFW').group().by().by(outE().count()).
3077
3303
DFW
3078
3304
----
3079
3305
3080
- So, lets now add to our query and create a new vertex with a label 'dfwcount' that is
3081
- going to store the number of routes originating in DFW using a property called
3306
+ So, let's now add to our query and create a new vertex with a label 'dfwcount' that
3307
+ is going to store the number of routes originating in DFW using a property called
3082
3308
'current_count' .
3083
3309
3084
3310
[source,groovy]