
pycottas


pycottas is a library for working with compressed RDF files in the COTTAS format. COTTAS stores triples as a triple table in Apache Parquet. pycottas is built on top of DuckDB and provides an HDT-like interface.
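To make the triple-table idea concrete, here is a minimal sketch of how a triple pattern can be answered against a table with one column per term. It is an illustration only and does not show how pycottas works internally: it uses the standard-library `sqlite3` module in place of DuckDB and Parquet so that it runs without extra dependencies, and the `triples` table, the `match_pattern` helper and the `example.org` data are all hypothetical.

```python
import sqlite3

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def match_pattern(conn, pattern, limit=None, offset=0):
    """Return triples matching (s, p, o); None acts as a wildcard.

    Hypothetical helper for illustration, not part of pycottas.
    """
    columns = ("s", "p", "o")
    # Each bound term becomes an equality filter on its column.
    clauses = [f"{col} = ?" for col, term in zip(columns, pattern) if term is not None]
    params = [term for term in pattern if term is not None]
    sql = "SELECT s, p, o FROM triples"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    # A fixed order keeps limit/offset paging deterministic.
    sql += " ORDER BY s, p, o LIMIT ? OFFSET ?"
    # SQLite treats a negative LIMIT as "no limit".
    params += [-1 if limit is None else limit, offset]
    return conn.execute(sql, params).fetchall()


# An in-memory triple table standing in for a COTTAS file (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("http://example.org/alice", RDF_TYPE, "http://example.org/Person"),
        ("http://example.org/bob", RDF_TYPE, "http://example.org/Person"),
        ("http://example.org/alice", "http://example.org/knows", "http://example.org/bob"),
    ],
)

# Equivalent in spirit to the pattern '?s rdf:type ?o'.
typed = match_pattern(conn, (None, RDF_TYPE, None))
for row in typed:
    print(row)
```

In this sketch the pattern with only the predicate bound becomes a single `WHERE p = ?` filter over the table.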

Features ✨

  • Compression and decompression of RDF files.
  • Querying COTTAS files with triple patterns.
  • RDFLib backend for querying COTTAS files with SPARQL.
  • Supports RDF datasets (quads).
  • Can be used as a library or via command line.

Documentation 📑

Read the documentation.

Getting Started 🚀

PyPI is the fastest way to install pycottas:

pip install pycottas

We recommend installing pycottas in a virtual environment.

import pycottas
from rdflib import Graph, URIRef

pycottas.rdf2cottas('my_file.ttl', 'my_file.cottas', index='spo')
res = pycottas.search('my_file.cottas', '?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?o')
print(res)
pycottas.cottas2rdf('my_file.cottas', 'my_file.nt')

# COTTASDocument class for querying with triple patterns
cottas_doc = pycottas.COTTASDocument('my_file.cottas')
# It is possible to create a document from multiple COTTAS files matching a glob pattern
cottas_doc = pycottas.COTTASDocument('test/*.cottas')
# the triple pattern can be a string or a tuple
res = cottas_doc.search('?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?o')
# limit and offset are optional
res = cottas_doc.search((None, URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'), None), limit=10, offset=20)
print(res)

# COTTASStore class for querying with SPARQL
graph = Graph(store=pycottas.COTTASStore("my_file.cottas"))
res = graph.query("""
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  SELECT DISTINCT ?s ?o WHERE {
    ?s rdf:type ?o .
  } LIMIT 10""")
for row in res:
    print(row)

To run pycottas from the command line, see the documentation.

License 🔓

pycottas is available under the Apache License 2.0.

Author & Contact 📬

Universidad Politécnica de Madrid.
