docs(DWH): feature branch for DWH #4766
Open
SamyOubouaziz
wants to merge
38
commits into
main
from
int-feat-dwh-feature-branch
+1,145
−0
8131d17
docs(DWH): feature branch for DWH
SamyOubouaziz 83cd01f
docs(DWH): update
SamyOubouaziz 14ce174
docs(srv): update
SamyOubouaziz 9fd0a73
docs(dwh): update
SamyOubouaziz ec98dad
docs(dwh): update
SamyOubouaziz e2fe40a
docs(dwh): update
SamyOubouaziz 5a4cf49
docs(dwh): update
SamyOubouaziz 6615dd8
feat(dwh): add how to create
SamyOubouaziz b0f318b
feat(dwh): update file titles
SamyOubouaziz cf2cf25
feat(dwh): add concepts MTA-5798
SamyOubouaziz 7fc335f
feat(dwh): add connect applications MTA-5800
SamyOubouaziz 38ebab6
feat(dwh): add connect applications MTA-5800
SamyOubouaziz c912c03
feat(dwh): add connect BI tools page MTA-5801
SamyOubouaziz 17cce23
feat(dwh): update
SamyOubouaziz b43ee82
feat(dwh): update
SamyOubouaziz 012c8a1
feat(dwh): update
SamyOubouaziz 2c7295a
feat(dwh): update
SamyOubouaziz f71b76f
feat(dwh): update
SamyOubouaziz a96e339
feat(dwh): update
SamyOubouaziz 4d7db47
feat(dwh): update
SamyOubouaziz 5b2a041
feat(dwh): update
SamyOubouaziz 53f88c8
feat(dwh): update
SamyOubouaziz 9494631
feat(dwh): update
SamyOubouaziz 432d712
Apply suggestions from code review
SamyOubouaziz c261553
feat(dwh): update
SamyOubouaziz 27a5b4f
feat(dwh): update
SamyOubouaziz 43f975d
feat(dwh): update
SamyOubouaziz 37da511
feat(dwh): update
SamyOubouaziz 8eb83a2
feat(dwh): update
SamyOubouaziz 8054f3d
feat(dwh): update
SamyOubouaziz c87698a
feat(dwh): update
SamyOubouaziz 0abc1f4
feat(dwh): update
SamyOubouaziz 0affa91
feat(dwh): update
SamyOubouaziz 534154c
feat(dwh): update
SamyOubouaziz 5c6b0ea
feat(dwh): update
SamyOubouaziz 1255c34
feat(dwh): hardcore menu update
SamyOubouaziz 6694627
feat(dwh): update menu
SamyOubouaziz 7c0d9e9
Merge branch 'main' into int-feat-dwh-feature-branch
SamyOubouaziz
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
--- | ||
meta: | ||
title: Data Warehouse for ClickHouse® - Concepts | ||
description: Understand key concepts behind Scaleway's Data Warehouse for ClickHouse®. | ||
content: | ||
h1: Data Warehouse for ClickHouse® - Concepts | ||
paragraph: Understand key concepts behind Scaleway's Data Warehouse for ClickHouse®. | ||
tags: data warehouse clickhouse concepts glossary terms definitions | ||
dates: | ||
published: 2025-05-07 | ||
validation: 2025-06-03 | ||
categories: | ||
- data-warehouse | ||
--- | ||
|
||
## Autoscaling | ||
|
||
Autoscaling refers to the ability of a Data Warehouse for ClickHouse® deployment to automatically adjust the number of instances without manual intervention. | ||
Scaling mechanisms ensure that resources are provisioned dynamically to handle incoming requests efficiently while minimizing idle capacity and cost. | ||
|
||
## Bottomless storage | ||
|
||
Bottomless storage is a feature that allows ClickHouse® to separate compute and storage by offloading data to external object storage, such as Amazon S3-compatible services, while keeping frequently accessed data cached locally for fast queries. In ClickHouse®, this is implemented by transparently moving older or less-used data to remote storage, enabling virtually unlimited storage capacity without sacrificing performance for active workloads. | ||
|
||
Refer to the official [ClickHouse® documentation](https://clickhouse.com/docs/guides/separation-storage-compute) for more information. | ||
|
||
## ClickHouse® | ||
|
||
ClickHouse® is a high-performance, column-oriented, distributed database management system designed for real-time analytics. It is optimized for handling large volumes of data with fast query performance, making it ideal for applications requiring up-to-date insights. ClickHouse® stores data in a columnar format, which reduces I/O operations and speeds up query execution. It supports distributed processing across multiple nodes, enabling horizontal scaling and fault tolerance through replication. ClickHouse® provides a powerful SQL interface and offers advanced features like real-time data ingestion, compression, and indexing, making it a robust solution for analytical workloads. | ||
|
||
## ClickHouse® HTTP console | ||
|
||
The ClickHouse® HTTP console is a web interface that allows you to interact easily with your Data Warehouse deployment from the Scaleway console by executing SQL queries. | ||
|
||
<Message type="note"> | ||
Make sure to enter valid credentials in the top-right fields before executing queries. | ||
</Message> | ||
|
||
## Column-oriented storage | ||
|
||
ClickHouse® stores data in a column-oriented format, which significantly optimizes read performance for analytical queries. By storing data in columns rather than rows, ClickHouse® reduces the number of I/O operations needed during query execution, as it only reads the necessary columns from disk. | ||
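The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of the idea, not a model of ClickHouse® internals: the table, the column names, and the "values read" counter are all hypothetical.

```python
# Illustrative only: a toy comparison of row and column layouts.
# The table contents and column names are made up for this sketch.
rows = [
    {"user_id": 1, "country": "FR", "amount": 10.0},
    {"user_id": 2, "country": "DE", "amount": 25.5},
    {"user_id": 3, "country": "FR", "amount": 4.5},
]


def sum_amount_row_store(table):
    """Row layout: every field of every row is loaded to sum one column."""
    values_read = 0
    total = 0.0
    for row in table:
        values_read += len(row)  # the whole row is read
        total += row["amount"]
    return total, values_read


# Column layout: each column is stored contiguously under its own name.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def sum_amount_column_store(cols):
    """Column layout: only the requested column is read."""
    amounts = cols["amount"]
    return sum(amounts), len(amounts)


row_total, row_reads = sum_amount_row_store(rows)
col_total, col_reads = sum_amount_column_store(columns)
print(f"row store:    total={row_total}, values read={row_reads}")
print(f"column store: total={col_total}, values read={col_reads}")
```

Both functions return the same total, but the column layout touches only one third of the values in this three-column table. The gap grows with the number of columns a query does not need.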
|
||
## Compression | ||
|
||
ClickHouse® uses advanced compression algorithms to reduce storage requirements and improve query performance by minimizing data transfer. Compression not only helps in saving disk space but also accelerates data retrieval and processing by reducing the amount of data that needs to be read from storage and transferred over the network. | ||
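As a rough illustration of why column data tends to compress well, the sketch below compresses a hypothetical low-cardinality column. It uses `zlib` only because it ships with Python; it is a stand-in, not the codec ClickHouse® applies to your data.

```python
import zlib

# Hypothetical column: 30,000 status values drawn from three distinct strings.
statuses = ["active", "inactive", "pending"]
column = [statuses[i % 3] for i in range(30_000)]

# Serialize the column contiguously, as a column store would keep it.
raw = "\n".join(column).encode("utf-8")
compressed = zlib.compress(raw)

ratio = len(raw) / len(compressed)
print(f"raw={len(raw)} bytes, compressed={len(compressed)} bytes, ratio={ratio:.0f}x")

# Compression is lossless: the original bytes are recovered exactly.
restored = zlib.decompress(compressed)
print("round trip ok:", restored == raw)
```

Values of the same type and meaning sit next to each other in a column, so repeated patterns are easy for a general-purpose codec to find. Real datasets are less regular than this example, so actual ratios will be lower.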
|
||
## Distributed processing | ||
|
||
ClickHouse® supports distributed processing across multiple nodes, allowing it to handle extremely large datasets efficiently and scale horizontally. This architecture enables ClickHouse® to distribute data and queries across a cluster, improving performance and reliability by leveraging the combined resources of all nodes. | ||
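The scatter-gather pattern behind distributed queries can be sketched as follows. The node count, the modulo-based placement rule, and the sample data are assumptions made for the illustration; they do not describe how a Data Warehouse for ClickHouse® deployment actually places data.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 3  # hypothetical cluster size


def shard_for(key: int) -> int:
    """Hypothetical placement rule: pick a node from the key."""
    return key % NUM_NODES


# Sample (user_id, amount) events, made up for this sketch.
events = [(user_id, user_id * 2) for user_id in range(1, 11)]

# Scatter: each node only stores the rows assigned to it.
shards = [[] for _ in range(NUM_NODES)]
for user_id, amount in events:
    shards[shard_for(user_id)].append(amount)


def partial_sum(shard):
    """Work done locally on one node: aggregate its own rows."""
    return sum(shard)


# Each node computes its partial result in parallel.
with ThreadPoolExecutor(max_workers=NUM_NODES) as pool:
    partials = list(pool.map(partial_sum, shards))

# Gather: the partial results are merged into the final answer.
total = sum(partials)
print(f"partial sums per node: {partials}, total: {total}")
```

The final merge only handles one value per node, which is why this pattern scales: most of the work happens where the data lives.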
|
||
## Horizontal scaling | ||
|
||
Horizontal scaling refers to the process of adding more nodes to the cluster to increase its capacity and performance. This approach allows the cluster to handle larger datasets and higher query loads by distributing the data and processing tasks across additional nodes. Data Warehouse for ClickHouse® deployments [scale automatically](#autoscaling) according to the incoming workload. | ||
|
||
## Indexing | ||
|
||
ClickHouse® employs various indexing techniques, such as primary key and skip indexes, to speed up query execution and data retrieval. The primary key index allows for efficient point lookups and range queries, while skip indexes help in quickly skipping over large chunks of data that do not match query conditions, thus reducing the overall query time. | ||
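The way a sparse primary index narrows a range query can be sketched in Python. The tiny granule size and the integer keys are assumptions chosen to keep the example readable (ClickHouse® defaults to 8192 rows per granule), and the lookup logic is a simplification rather than the engine's actual implementation.

```python
from bisect import bisect_right

GRANULE_SIZE = 4  # hypothetical; kept small for readability

# Table rows sorted by primary key, split into fixed-size granules.
keys = list(range(0, 40, 2))
granules = [keys[i:i + GRANULE_SIZE] for i in range(0, len(keys), GRANULE_SIZE)]

# Sparse index: one entry (the first key) per granule, not one per row.
marks = [granule[0] for granule in granules]


def granules_for_range(lo, hi):
    """Return the indexes of granules that may hold keys in [lo, hi]."""
    first = max(bisect_right(marks, lo) - 1, 0)
    last = bisect_right(marks, hi) - 1
    return list(range(first, last + 1))


def range_query(lo, hi):
    """Scan only the candidate granules and filter their rows."""
    candidates = granules_for_range(lo, hi)
    hits = []
    for index in candidates:
        hits.extend(key for key in granules[index] if lo <= key <= hi)
    return hits, candidates


hits, candidates = range_query(10, 18)
print(f"scanned granules {candidates} out of {len(granules)}, found keys {hits}")
```

Only the granules whose key range can overlap the query are read; the others are skipped without being loaded. Skip indexes apply the same idea to columns that are not part of the primary key.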
|
||
## Node | ||
|
||
In the context of a distributed Data Warehouse for ClickHouse® cluster (also called "deployment"), a node refers to an individual instance that stores and processes a portion of the data. Each node participates in data distribution, query execution, and replication to ensure balanced load, fault tolerance, and high availability. Nodes communicate with each other to coordinate tasks, execute queries in parallel, and maintain synchronized data replicas. They are configured with specific settings to define their roles and manage resources, allowing the cluster to scale and perform efficiently. | ||
|
||
## Replica set | ||
|
||
A replica set consists of multiple nodes that store identical copies of the same data. This setup ensures fault tolerance and high availability by providing redundancy. If one node in the replica set fails, another node can take over, ensuring continuous data access and processing. ClickHouse® automatically handles data replication and failover, making it a reliable solution for mission-critical applications. | ||
|
||
## SQL support | ||
|
||
ClickHouse® provides a powerful SQL interface, enabling users to perform complex queries and data manipulations using familiar SQL syntax. This extensive SQL support includes a wide range of functions and features, such as subqueries, window functions, and user-defined functions, making it accessible to both analysts and developers. | ||
|
||
|
||
## Vertical scaling | ||
|
||
Vertical scaling refers to the process of increasing the resources of individual nodes within the cluster. Vertical scaling enhances the performance and capacity of individual nodes, allowing them to handle larger datasets and more complex queries more efficiently. Vertical scaling is often used in conjunction with [horizontal scaling](#horizontal-scaling) to optimize performance and resource utilization in a Data Warehouse for ClickHouse® deployment. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
--- | ||
meta: | ||
title: Data Warehouse for ClickHouse® FAQ | ||
description: Discover Scaleway Data Warehouse for ClickHouse® and find answers to general questions. | ||
content: | ||
h1: Data Warehouse for ClickHouse® | ||
dates: | ||
validation: 2025-06-03 | ||
category: managed-services | ||
productIcon: DataWarehouseProductIcon | ||
--- | ||
|
||
## What is Scaleway Data Warehouse for ClickHouse®? | ||
|
||
Scaleway Data Warehouse for ClickHouse® allows you to perform queries and analytics on structured datasets, up to petabytes, without managing any infrastructure. | ||
|
||
Data Warehouse for ClickHouse® supports SQL queries and Apache Spark™-compatible libraries, and can be integrated with data analysis and storage tools like Amazon S3-compatible storage, data visualization software, BI tools, or ETL solutions. | |
|
||
Scaleway handles seamless scaling, failover, settings and backups administration to let you focus on building your application. | ||
|
||
Refer to the [Data Warehouse for ClickHouse® documentation](/data-warehouse/) for more information on the product and its features. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,208 @@ | ||
--- | ||
meta: | ||
title: How to connect to your deployment | ||
description: Learn how to connect yourself or your applications to your Scaleway Data Warehouse for ClickHouse® deployment. | ||
content: | ||
h1: How to connect applications to your deployment | |
paragraph: Learn how to connect yourself or your applications to your Scaleway Data Warehouse for ClickHouse® deployment. | ||
tags: connect applications deployment data warehouse clickhouse | ||
dates: | ||
validation: 2025-06-03 | ||
posted: 2025-06-03 | ||
categories: | ||
- data-warehouse | ||
--- | ||
|
||
This page explains how to connect yourself or your applications to your Data Warehouse for ClickHouse® deployment using the [Scaleway console](https://console.scaleway.com). | ||
|
||
To connect your deployment with BI tools, refer to the [dedicated documentation](/data-warehouse/how-to/connect-bi-tools/). | ||
|
||
<Macro id="requirements" /> | ||
|
||
- A Scaleway account logged into the [console](https://console.scaleway.com) | ||
- [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization | ||
- [Signed up to the private beta](https://www.scaleway.com/fr/betas/) and received a confirmation email. | ||
- Created a [Data Warehouse deployment](/data-warehouse/how-to/create-deployment/) | ||
|
||
1. Click **ClickHouse®** under **Data & Analytics** on the side menu. The Data Warehouse deployment page displays. | ||
|
||
2. Click the name of the Data Warehouse deployment you want to connect to. The overview tab of the deployment displays. | ||
|
||
3. Click the **Actions** button in the top-right corner of the page. A drop-down menu displays. | ||
|
||
4. Select **Connect using frameworks**. The connection wizard displays. | ||
|
||
<Message type="note"> | ||
To connect your deployment with BI tools, refer to the [dedicated documentation](/data-warehouse/how-to/connect-bi-tools/). | ||
</Message> | ||
|
||
5. Select your preferred framework: | ||
|
||
**Protocols** | ||
|
||
Select the appropriate protocol, then run the displayed command in a terminal. Remember to replace the placeholders with the appropriate values, and to specify the correct path to the certificate file. | ||
|
||
<Tabs id="data-warehouse-connect-protocols"> | ||
<TabsTab label="ClickHouse® CLI"> | |
```sh | ||
clickhouse client \ | ||
--host <YOUR_DEPLOYMENT_ID>.dtwh.<REGION>.scw.cloud \ | ||
--port 9440 \ | ||
--secure \ | ||
--user scwadmin \ | ||
--password '<PASSWORD>' | ||
``` | ||
</TabsTab> | ||
<TabsTab label="MySQL"> | ||
```sh | ||
mysql -h <YOUR_DEPLOYMENT_ID>.dtwh.<REGION>.scw.cloud \ | ||
-P 9004 \ | ||
-u scwadmin \ | ||
--password='<PASSWORD>' -e "SELECT 1;" | ||
``` | ||
<Message type="note"> | ||
MySQL connection is exposed publicly. Use ClickHouse® CLI for a secure connection. | ||
</Message> | ||
</TabsTab> | ||
<TabsTab label="HTTPS"> | ||
```sh | ||
echo 'SELECT 1' | curl 'https://scwadmin:<PASSWORD>@<YOUR_DEPLOYMENT_ID>.dtwh.<REGION>.scw.cloud:443' -d @- | ||
``` | ||
<Message type="note"> | ||
`curl` only sends individual SQL queries over HTTPS. It does not open an interactive session with your Data Warehouse for ClickHouse® deployment. | |
</Message> | ||
</TabsTab> | ||
</Tabs> | ||
<br/> | ||
**Languages** | ||
|
||
Select the desired language, then run the code displayed to create a file that connects to your deployment, and run queries programmatically. Remember to replace the placeholders with the appropriate values, and to specify the correct path to the certificate file. | ||
|
||
<Tabs id="data-warehouse-connect-languages"> | ||
<TabsTab label="Python"> | ||
```python | ||
pip install clickhouse-connect | ||
cat <<EOF >clickhouse.py | ||
import clickhouse_connect | ||
|
||
client = clickhouse_connect.get_client( | ||
host="<YOUR_DEPLOYMENT_ID>.dtwh.<REGION>.scw.cloud", | ||
port=443, | ||
username="scwadmin", | ||
password="<PASSWORD>", | ||
) | ||
query_result = client.query("SELECT 1") | ||
print(query_result.result_set) | ||
EOF | ||
python clickhouse.py | ||
``` | ||
</TabsTab> | ||
<TabsTab label="Go"> | ||
```go | ||
mkdir ClickHouse-go | ||
cd ClickHouse-go | ||
go mod init ClickHouse-go | ||
cat <<EOF >main.go | ||
package main | ||
|
||
import ( | |
"context" | |
"crypto/tls" | |
"fmt" | |
"log" | |
"github.com/ClickHouse/clickhouse-go/v2" | |
) | |
|
||
func main() { | |
conn, err := clickhouse.Open(&clickhouse.Options{ | |
Addr: []string{"<YOUR_DEPLOYMENT_ID>.dtwh.<REGION>.scw.cloud:9440"}, | |
Auth: clickhouse.Auth{ | |
Database: "default", | |
Username: "scwadmin", | |
Password: "<PASSWORD>", | |
}, | |
// Port 9440 is the TLS-secured native port, so TLS must be enabled. | |
TLS: &tls.Config{}, | |
}) | |
if err != nil { | ||
log.Fatal(err) | ||
} | ||
defer conn.Close() | ||
|
||
ctx := context.Background() | ||
var result uint8 // SELECT 1 returns a UInt8, which cannot be scanned into a string | |
if err := conn.QueryRow(ctx, "SELECT 1").Scan(&result); err != nil { | ||
log.Fatal(err) | ||
} | ||
fmt.Println(result) | ||
} | ||
EOF | ||
go mod tidy | |
go run . | |
``` | ||
</TabsTab> | ||
<TabsTab label="Node.js"> | ||
```node | ||
npm i @clickhouse/client | ||
cat <<EOF >clickhouse.js | ||
const { createClient } = require('@clickhouse/client') // require() runs without an ES module setup | |
|
||
void (async () => { | ||
const client = createClient({ | |
url: 'https://<YOUR_DEPLOYMENT_ID>.dtwh.<REGION>.scw.cloud:443', | |
username: 'scwadmin', | |
password: '<PASSWORD>' | |
}) | |
const rows = await client.query({ | ||
query: 'SELECT 1', | ||
format: 'JSONEachRow', | ||
}) | ||
console.info(await rows.json()) | ||
await client.close() | ||
})() | ||
EOF | ||
node clickhouse.js | ||
``` | ||
</TabsTab> | ||
<TabsTab label="Java"> | ||
```java | ||
cat <<EOF >pom.xml | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<project | ||
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd" | ||
xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> | ||
<modelVersion>4.0.0</modelVersion> | ||
<groupId>clickhouse-java</groupId> | ||
<artifactId>clickhouse-java</artifactId> | ||
<version>1.0.0</version> | ||
<packaging>jar</packaging> | ||
<dependencies> | ||
<dependency> | ||
<groupId>com.clickhouse</groupId> | ||
<artifactId>clickhouse-jdbc</artifactId> | ||
<version>0.7.2</version> | ||
</dependency> | ||
</dependencies> | ||
</project> | ||
EOF | ||
mkdir -p src/main/java | ||
cat <<EOF >src/main/java/Main.java | ||
import java.sql.*; | ||
import java.lang.ClassNotFoundException; | ||
|
||
public class Main { | ||
public static void main(String[] args) throws SQLException, ClassNotFoundException { | ||
Class.forName("com.clickhouse.jdbc.ClickHouseDriver"); | ||
String url = "jdbc:ch://scwadmin:<PASSWORD>@<YOUR_DEPLOYMENT_ID>.dtwh.<REGION>.scw.cloud:443/default?ssl=true"; | |
try (Connection con = DriverManager.getConnection(url); Statement stmt = con.createStatement()) { | ||
ResultSet result_set = stmt.executeQuery("SELECT 1 AS one"); | ||
while (result_set.next()) { | ||
System.out.println(result_set.getInt("one")); | ||
} | ||
} | ||
} | ||
} | ||
EOF | ||
mvn clean compile | ||
mvn exec:java -Dexec.mainClass=Main | ||
``` | ||
</TabsTab> | ||
</Tabs> | ||
|
||
You are now connected to your Data Warehouse for ClickHouse® deployment using the administrator account. |
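For scripts that cannot install a client library, the HTTPS request shown in the **Protocols** section can also be reproduced with Python's standard library. The sketch below is a minimal example under assumptions: the helper names `build_query_request` and `run_query` are hypothetical, and the host pattern and the `scwadmin` user are taken from the commands on this page. The credentials are sent as HTTP basic authentication, which is what `curl` does with the `user:password@host` URL form.

```python
import base64
import urllib.request


def build_query_request(deployment_id, region, password, query, user="scwadmin"):
    """Build (but do not send) a POST request carrying one SQL query."""
    url = f"https://{deployment_id}.dtwh.{region}.scw.cloud:443/"
    # HTTP basic authentication: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=query.encode("utf-8"),
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


def run_query(request, timeout=10):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Replace the placeholders with your own values before sending anything.
    request = build_query_request(
        "<YOUR_DEPLOYMENT_ID>", "<REGION>", "<PASSWORD>", "SELECT 1"
    )
    print(request.full_url)
    # print(run_query(request))  # uncomment once the placeholders are replaced
```

Building the request separately from sending it makes it possible to inspect the URL and headers before any network call is made.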