/Specification API #4

Merged: 6 commits, Feb 11, 2025
27 changes: 26 additions & 1 deletion README.md
@@ -19,7 +19,7 @@ In order to build and test the software outside of Docker, you will need

You can run the API locally by running either `make compose-up` or `docker compose up -d --build`.

The docker compose setup runs the S3 locally using Localstack as well as the API. An S3 bucket called local-collection-data is created and seeded with example issue log data.
The docker compose setup runs the S3 locally using Localstack as well as the API. An S3 bucket called local-collection-data is created and seeded with example files in the collection-data directory.


## Swagger UI
Expand Down Expand Up @@ -69,3 +69,28 @@ Request for issues for a specific dataset and resource:
curl http://localhost:8000/log/issue?dataset=border&resource=4a57239e3c1174c80b6d4a0278ab386a7c3664f2e985b2e07a66bbec84988b30&field=geometry
```

### provision_summary endpoint

can be accessed via
```
http://localhost:8000/performance/provision_summary?organisation=local-authority:LBH&offset=50&limit=100
```

Optional Parameters:
* Offset
* Limit
* Organisation
* Dataset


### specification endpoint

can be accessed via
```
http://localhost:8000/specification/specification?offset=0&limit=10
```

Optional Parameters:
* Offset
* Limit
* Dataset
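
As an illustration of the optional parameters listed above, the following is a minimal sketch of building a request URL for the specification endpoint. The helper name `specification_url` is hypothetical and not part of this PR; the base address is the local docker compose address used in the README examples.

```python
from urllib.parse import urlencode

# Local address used by the README examples (docker compose setup).
BASE_URL = "http://localhost:8000"


def specification_url(offset=0, limit=10, dataset=None):
    """Build a /specification/specification URL, omitting unset optional parameters.

    Hypothetical helper for illustration only.
    """
    query = {"offset": offset, "limit": limit}
    if dataset is not None:
        query["dataset"] = dataset
    return f"{BASE_URL}/specification/specification?{urlencode(query)}"


print(specification_url())
print(specification_url(limit=8, dataset="article-4-direction-area"))
```

Using `urlencode` rather than string concatenation keeps any special characters in a dataset name correctly escaped.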
33 changes: 28 additions & 5 deletions src/db.py

Do these SQL queries need parameterising?

Yes, parameterising them would definitely prevent SQL injection. I will make the change.
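
For reference, a minimal sketch of what parameterising a filter like the `dataset` condition could look like. It uses the standard-library `sqlite3` module so that it runs without extra dependencies; this is a stand-in, not the code in this PR. DuckDB's Python `execute` likewise accepts `?` placeholders with a list of parameters. The table, rows, and the `count_by_dataset` helper are hypothetical.

```python
import sqlite3


def count_by_dataset(conn, dataset):
    """Count rows for a dataset using a bound parameter instead of an f-string."""
    # The value travels separately from the SQL text, so it is never parsed as SQL.
    row = conn.execute(
        "SELECT COUNT(*) FROM specification WHERE dataset = ?",
        (dataset,),
    ).fetchone()
    return row[0]


# Hypothetical in-memory data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE specification (dataset TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO specification VALUES (?, ?)",
    [("article-4-direction-area", "Article 4 direction"), ("tree", "Tree")],
)

print(count_by_dataset(conn, "tree"))
# An injection attempt is treated as a literal string and matches nothing.
print(count_by_dataset(conn, "x' OR '1'='1"))
```

With string interpolation, the second call would have turned the condition into one that matches every row; with a bound parameter it is compared as plain text.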

@@ -3,7 +3,7 @@
from schema import IssuesParams, ProvisionParams, SpecificationsParams
from pagination_model import PaginationParams, PaginatedResult
from config import config

import json

logger = get_logger(__name__)

Expand Down Expand Up @@ -101,12 +101,23 @@ def get_specification(params: SpecificationsParams):
pagination = f"LIMIT {params.limit} OFFSET {params.offset}"

where_clause = ""

if params.dataset:
where_clause += _add_condition(where_clause, f"dataset = '{params.dataset}'")
where_clause += _add_condition(
where_clause,
f"TRIM(BOTH '\"' FROM json_extract(json(value), '$.dataset')) = '{params.dataset}'",
)

sql_count = f"SELECT COUNT(*) FROM '{s3_uri}' {where_clause}"
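# Count every matching row; LIMIT/OFFSET belongs only on the results query,
# because an OFFSET > 0 would skip the single COUNT(*) row.
sql_count = f"""
SELECT COUNT(*) FROM (SELECT unnest(CAST(json AS VARCHAR[])) AS value
FROM '{s3_uri}') as parsed_json {where_clause}
"""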
logger.debug(sql_count)
sql_results = f"SELECT * FROM '{s3_uri}' {where_clause} {pagination}"
sql_results = f"""
SELECT value as json FROM
(SELECT unnest(CAST(json AS VARCHAR[])) AS value FROM '{s3_uri}') AS parsed_json
{where_clause} {pagination}
"""
logger.debug(sql_results)

with duckdb.connect() as conn:
@@ -118,14 +129,26 @@ def get_specification(params: SpecificationsParams):
).fetchall()
)
logger.debug(conn.execute("FROM duckdb_secrets();").fetchall())

count = conn.execute(sql_count).fetchone()[
0
] # Count is first item in Tuple
results = conn.execute(sql_results).arrow().to_pylist()

# Extract and parse the JSON field
json_results = []
for item in results:
if "json" in item and isinstance(item["json"], str):
try:
parsed_json = json.loads(item["json"])
json_results.append(parsed_json)
except json.JSONDecodeError:
logger.warning(f"Invalid JSON format in row: {item['json']}")

return PaginatedResult(
params=PaginationParams(offset=params.offset, limit=params.limit),
total_results_available=count,
data=results,
data=json_results,
)
except Exception as e:
logger.exception(
Expand Down
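
The JSON extraction step added in this diff can be sketched in isolation as follows. This is a simplified illustration under the assumption that each result row is a dict with a string `json` field; the `parse_json_rows` name and the sample rows are hypothetical.

```python
import json
import logging

logger = logging.getLogger(__name__)


def parse_json_rows(rows):
    """Parse the 'json' string of each row, skipping rows that are missing or invalid."""
    parsed = []
    for row in rows:
        raw = row.get("json")
        if not isinstance(raw, str):
            # No usable JSON string in this row.
            continue
        try:
            parsed.append(json.loads(raw))
        except json.JSONDecodeError:
            logger.warning("Invalid JSON format in row: %s", raw)
    return parsed


# Hypothetical rows: one valid, one malformed, one without a 'json' key.
rows = [
    {"json": '{"dataset": "article-4-direction-area"}'},
    {"json": "{not json"},
    {"other": 1},
]
print(parse_json_rows(rows))
```

Because malformed rows are skipped rather than raised, the number of items returned on a page can be smaller than the requested limit.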
27 changes: 25 additions & 2 deletions tests/integration/test_main.py
@@ -88,8 +88,31 @@ def test_specification(s3_bucket):

response_data = response.json()
assert "X-Pagination-Total-Results" in response.headers
assert response.headers["X-Pagination-Total-Results"] == str(16)
assert response.headers["X-Pagination-Total-Results"] == str(36)
assert response.headers["X-Pagination-Limit"] == "8"

assert len(response_data) > 0
assert response_data[0]["name"] == "Article 4 direction"


def test_specification_with_dataset(s3_bucket):
# Prepare test params
params = {
"offset": 0,
"limit": 8,
"dataset": "article-4-direction-area",
}

response = client.get("/specification/specification", params=params)

# Validate the results from the search
assert response.status_code == 200

response_data = response.json()
assert "X-Pagination-Total-Results" in response.headers
assert response.headers["X-Pagination-Total-Results"] == str(1)
assert response.headers["X-Pagination-Limit"] == "8"

assert len(response_data) > 0
assert response_data[0]["dataset"] == "article-4-direction-area"
assert response_data[0]["fields"]
assert len(response_data[0]["fields"]) > 1