Commit a80e152

CMR-7312: As a CMR client user, I would like to see the API doc on graph db related endpoints and features.
1 parent 269bc78 commit a80e152

1 file changed, +60 -0 lines changed

graph-db/README.md

+60
@@ -56,3 +56,63 @@ There are two serverless applications that interact with the graph database:
* Indexer

Indexer is a serverless application connected to an SQS queue that receives the live CMR collection ingest/update events. It indexes each new CMR collection ingest/update into the graph database.

## Explore Indexed Data
The CMR graph database is a Neptune database hosted on AWS. Currently, only collections and their documentation-related URLs are indexed as vertices in the graph database. Edges (named `documents`) run from each related URL vertex to the collection vertices that reference it.

The collection vertex (label `dataset`) has the following properties:

* concept-id - Collection concept id
* name - URL of the collection landing page
* title - Entry title of the collection
* doi - DOI of the collection

The documentation vertex (label `documentation`) has the following properties:

* name - Documentation URL
* title - Description of the documentation URL

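The data model above can be pictured with a small in-memory sketch. The class names `Collection` and `Documentation`, the helper `collections_sharing_docs`, and the second sample collection (`C0000000001-DEMO`) are illustrative assumptions and not part of CMR. Only the property names, the vertex labels, and the direction of the `documents` edge are taken from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collection:
    """Vertex with label `dataset`."""
    concept_id: str  # stored in the graph as the `concept-id` property
    name: str        # URL of the collection landing page
    title: str       # entry title of the collection
    doi: str


@dataclass(frozen=True)
class Documentation:
    """Vertex with label `documentation`."""
    name: str   # documentation URL
    title: str  # description of the documentation URL


def collections_sharing_docs(concept_id, edges):
    """Return sorted concept ids of collections that share a documentation vertex.

    `edges` is a list of (Documentation, Collection) pairs, one per `documents`
    edge (documentation -> collection). The starting collection is included in
    the result, and duplicates are removed.
    """
    docs = {doc for doc, coll in edges if coll.concept_id == concept_id}
    return sorted({coll.concept_id for doc, coll in edges if doc in docs})


# Sample data: the first collection and the document come from the examples in
# this README; the second collection is made up for illustration.
pip = Collection(
    "C1233352242-GHRC",
    "https://dx.doi.org/undefined",
    "GPM Ground Validation Precipitation Imaging Package (PIP) ICE POP V1",
    "10.5067/GPMGV/ICEPOP/PIP/DATA101",
)
demo = Collection("C0000000001-DEMO", "https://example.com/demo", "Demo collection", "10.0000/DEMO")
paper = Documentation(
    "https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20180003615.pdf",
    "NASA Participation in the International Collaborative Experiments for "
    "Pyeongchang 2018 Olympic and Paralympic Winter Games (ICE-POP 2018)",
)
edges = [(paper, pip), (paper, demo)]

print(collections_sharing_docs("C1233352242-GHRC", edges))
```
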
### Access via CMR graphdb endpoint

The CMR graphdb endpoint is at https://cmr.sit.earthdata.nasa.gov/graphdb. Users can use the [Gremlin API](https://tinkerpop.apache.org/gremlin.html) to explore the relationships indexed in the graph database. Here are some examples:

To see the total number of vertices in the graph db:
```
curl -XPOST https://cmr.sit.earthdata.nasa.gov/graphdb -d '{"gremlin":"g.V().count()"}'
```

To see the content of the first 10 vertices in the graph db:
```
curl -XPOST https://cmr.sit.earthdata.nasa.gov/graphdb -d '{"gremlin":"g.V().limit(10)"}'
```

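The same read-only requests can also be issued from a script. What follows is a minimal Python sketch using only the standard library. The function names `build_request` and `run_query` are hypothetical, and the `Content-Type: application/json` header is an assumption, since the curl examples do not set one explicitly.

```python
import json
import urllib.request

GRAPHDB_URL = "https://cmr.sit.earthdata.nasa.gov/graphdb"


def build_request(gremlin, url=GRAPHDB_URL):
    """Build a POST request whose body has the same shape as the curl `-d` payload."""
    body = json.dumps({"gremlin": gremlin}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumption, not from the README
        method="POST",
    )


def run_query(gremlin, url=GRAPHDB_URL, timeout=30):
    """Send the query and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(build_request(gremlin, url), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only builds and prints the request; call run_query(...) to actually send it.
    request = build_request("g.V().count()")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the payload before anything reaches the endpoint.
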
To see all collections that share a documentation URL with the collection `C1233352242-GHRC`:
```
curl -XPOST https://cmr.sit.earthdata.nasa.gov/graphdb -d '{"gremlin":"g.V().hasLabel(\"dataset\").has(\"concept-id\", \"C1233352242-GHRC\").inE(\"documents\").outV().hasLabel(\"documentation\").outE(\"documents\").inV().hasLabel(\"dataset\").valueMap()"}'
```

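The traversal above starts at the collection vertex, follows its incoming `documents` edges back to the documentation vertices, and then follows their outgoing `documents` edges to every collection they document. The sketch below composes that query step by step and produces the escaped JSON payload. The helper names `shared_docs_query` and `to_payload` are hypothetical, and quoting the concept id with `json.dumps` assumes that JSON string escaping is adequate for a Gremlin string literal, which holds for plain concept ids like the one shown.

```python
import json


def shared_docs_query(concept_id):
    """Compose the shared-documentation traversal for one collection."""
    steps = [
        'g.V().hasLabel("dataset")',                          # all collection vertices
        f'has("concept-id", {json.dumps(concept_id)})',       # pick the starting collection
        'inE("documents").outV().hasLabel("documentation")',  # back to its documentation
        'outE("documents").inV().hasLabel("dataset")',        # forward to all documented collections
        'valueMap()',                                         # return vertex properties
    ]
    return ".".join(steps)


def to_payload(gremlin):
    """Serialize the query into the compact JSON body used with `curl -d`."""
    return json.dumps({"gremlin": gremlin}, separators=(",", ":"))


print(to_payload(shared_docs_query("C1233352242-GHRC")))
```
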
Users who have write access to graphdb can also add vertices and edges between vertices. For example:

To create a collection vertex:
```
curl -XPOST https://cmr.sit.earthdata.nasa.gov/graphdb -d '{"gremlin":"g.addV(\"dataset\").property(\"name\", \"https://dx.doi.org/undefined\").property(\"title\", \"GPM Ground Validation Precipitation Imaging Package (PIP) ICE POP V1\").property(\"concept-id\", \"C1233352242-GHRC\").property(\"doi\", \"10.5067/GPMGV/ICEPOP/PIP/DATA101\")"}'
```

To create a documentation vertex:
```
curl -XPOST https://cmr.sit.earthdata.nasa.gov/graphdb -d '{"gremlin":"g.addV(\"documentation\").property(\"name\", \"https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20180003615.pdf\").property(\"title\", \"NASA Participation in the International Collaborative Experiments for Pyeongchang 2018 Olympic and Paralympic Winter Games (ICE-POP 2018)\")"}'
```

To create an edge from the above documentation vertex to the collection vertex:
```
curl -XPOST https://cmr.sit.earthdata.nasa.gov/graphdb -d '{"gremlin":"g.V().hasLabel(\"documentation\").has(\"name\", \"https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20180003615.pdf\").addE(\"documents\").to(g.V().hasLabel(\"dataset\").has(\"concept-id\", \"C1233352242-GHRC\"))"}'
```

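The three write queries above follow a regular pattern, so they can be generated rather than typed by hand. This is a rough sketch under stated assumptions: the helpers `quote`, `add_vertex_query`, and `add_documents_edge_query` are hypothetical, the escaping covers only backslashes and double quotes, and plain string concatenation like this is suitable only for trusted input.

```python
def quote(value):
    """Render a double-quoted Gremlin string literal (minimal escaping only)."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def add_vertex_query(label, properties):
    """Build `g.addV(label).property(key, value)...` from a dict of string properties."""
    query = f"g.addV({quote(label)})"
    for key, value in properties.items():
        query += f".property({quote(key)}, {quote(value)})"
    return query


def add_documents_edge_query(doc_url, concept_id):
    """Build the edge query in the same form as the README's curl example."""
    return (
        f'g.V().hasLabel("documentation").has("name", {quote(doc_url)})'
        f'.addE("documents")'
        f'.to(g.V().hasLabel("dataset").has("concept-id", {quote(concept_id)}))'
    )


doc_url = "https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20180003615.pdf"
print(add_vertex_query("dataset", {"concept-id": "C1233352242-GHRC"}))
print(add_documents_edge_query(doc_url, "C1233352242-GHRC"))
```

The resulting strings go in the `gremlin` field of the JSON body, exactly as in the curl commands.
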
### Access via SSH tunnel and Gremlin Console locally
Users who have access to the AWS Neptune endpoint via an internal jumpbox can set up an SSH tunnel to the Neptune endpoint and start the Gremlin Console locally to connect through it. They can then use the Gremlin Console to explore the graph database as if it were local.

Prerequisites: The user must have SSH access to the internal jumpbox that has access to the Neptune endpoint.

See [this AWS document](https://docs.aws.amazon.com/neptune/latest/userguide/access-graph-gremlin-console.html) on how to set up the Gremlin Console to connect to a Neptune DB instance.

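As a rough illustration of the tunnel step, the sketch below assembles an SSH local port-forward command. The helper `tunnel_command` and both hostnames are placeholders invented for this example, so substitute the real jumpbox and Neptune endpoint. Port 8182 is Neptune's default port; whether this deployment uses it is an assumption.

```python
import shlex

NEPTUNE_PORT = 8182  # Neptune's default port; adjust if the cluster differs


def tunnel_command(jumpbox, neptune_endpoint, local_port=NEPTUNE_PORT, remote_port=NEPTUNE_PORT):
    """Return the argv for an SSH local port forward through the jumpbox."""
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:{neptune_endpoint}:{remote_port}",
        jumpbox,
    ]


# Placeholder hosts, for illustration only.
argv = tunnel_command(
    "user@jumpbox.example.com",
    "my-neptune.cluster-xxxx.us-east-1.neptune.amazonaws.com",
)
print(shlex.join(argv))
```

The list can be passed to `subprocess.run(argv)` to hold the tunnel open while the Gremlin Console connects to the forwarded local port.
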
Happy exploring!
