Add better architecture diagram #165

Status: Closed · wants to merge 4 commits
14 changes: 7 additions & 7 deletions README.md
@@ -3,10 +3,10 @@
<div align="center">

<h4 align="center">
<a href="https://double.cloud/services/doublecloud-transfer/">Double Cloud Transfer</a> |
<a href="./docs/getting_started.md">Documentation</a> |
<a href="./docs/benchmarks.md">Benchmarking</a> |
<a href="./roadmap/roadmap_2024.md">Roadmap</a>
<a href="https://doublecloud.github.io/transfer/">Double Cloud Transfer</a> |
<a href="https://doublecloud.github.io/transfer/docs/getting_started.html">Documentation</a> |
<a href="https://doublecloud.github.io/transfer/docs/benchmarks.html">Benchmarking</a> |
<a href="https://doublecloud.github.io/transfer/docs/roadmap">Roadmap</a>
</h4>


@@ -186,18 +186,18 @@ More details [here](./docs/deploy_k8s.md).
## ⚡ Performance


[Naive-s3-vs-airbyte](./docs/benchmark_vs_airbyte.md)
[Naive-s3-vs-airbyte](https://medium.com/@laskoviymishka/transfer-s3-connector-vs-airbyte-s3-connector-360a0da084ae)

</div>

![Naive-s3-vs-airbyte](./assets/bench_s3_vs_airbyte.png)
![Naive-s3-vs-airbyte](./docs/_assets/bench_s3_vs_airbyte.png)

<div align="center">

## 📐 Architecture


<img src="./assets/logo.png" alt="transfer" />
<img src="./docs/_assets/architecture.png" alt="transfer" />

</div>

Binary file removed assets/bench_s3_vs_airbyte.png
Binary file not shown.
Binary file added docs/_assets/architecture.png
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
Binary file added docs/_assets/bench_s3_vs_airbyte.png
28 changes: 24 additions & 4 deletions docs/architecture-overview.md
@@ -14,7 +14,8 @@ The white paper is structured as follows:

1. **Introduction**: An overview of the purpose and goals of the system.
2. **Systems Overview**: A brief overview of current systems that require different approaches for data synchronization.
3. **Replication Techniques**: Description of the main replication techniques and application scenarios.
3. **Architecture Overview**: High-level principles behind the {{ data-transfer-name }} architecture.
4. **Replication Techniques**: Description of the main replication techniques and application scenarios.
5. **Data Integrity**: Discussion of how to achieve data integrity across different types of storages and possible design decisions.
6. **Challenges**: Description of the main challenges encountered while building the system as a service.
7. **Case Studies**: In-depth analysis of some case studies where the system was used and how it helped.
@@ -82,8 +83,6 @@ The first step in overcoming the above challenges was to formulate the requireme

After careful analysis, we crystallized our requirements for the future **{{ data-transfer-name }}** product as follows:



* **Minimize Delivery Lag**: The system must guarantee that data is delivered with a freshness lag of only a few seconds to be considered useful.
* **Guarantee Quality of Data**: The system must provide data with inferred schema from source tables and guarantee consistency between storages, with eventual consistency being acceptable.
* **Serializable Intermediate Format**: The system must provide a uniform intermediate format to transport data, with the option to route traffic through a persistent queue.
@@ -108,11 +107,32 @@ To minimize development efforts and system complexity, we must have some univers

Many middlewares exist between the source and sink for metrics collection, transformers application and logging.


## Architecture Overview

The system is built around a **core module** that acts as the central part of the application, managing its internal logic and facilitating communication between components. Users can interact with the system through either a **Command-Line Interface (CLI)** or via a **Component Development Kit (CDK)**, which serves as a library of interfaces for embedding functionality into external systems.

*Architecture*:
![Transfer architecture](_assets/architecture.png "Transfer architecture")


At its heart, the application follows a **plugin-based architecture**, enabling extensibility and modularity.
Plugins are integrated at **compile time** as Go dependencies, ensuring tight integration and optimal performance.
The system is implemented as a **Go monolith**, providing a streamlined and cohesive runtime environment.

The core connects to **plugins** in various domains, such as **connectors** (e.g., S3, PostgreSQL, ClickHouse), or **transformers** (e.g., renaming or SQL transformations).
These plugins are further glued together by **middlewares**, enabling data processing and transformations to be seamlessly chained.
A shared **data model** ensures consistent communication between components, while **connectors** handle all database-specific logic, **transformers** perform computations based on the shared **data model**, and a **coordinator** manages state tracking and coordination between nodes of **{{data-transfer-name}}** deployments.

This modular approach allows the system to remain flexible, robust, and scalable while adhering to Go's principles of simplicity and high performance.
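The plugin and middleware wiring described above can be sketched in Go. This is a minimal illustration, not the actual Transfer CDK: `Item`, `Sink`, `SinkFunc`, `Middleware`, and `WithLogging` are hypothetical names chosen for the example.

```go
package main

import "fmt"

// Item is a stand-in for the shared data model that connectors
// and transformers exchange (the real model is richer).
type Item struct {
	Table string
	Value string
}

// Sink is a hypothetical sink-side plugin interface.
type Sink interface {
	Push(items []Item) error
}

// SinkFunc adapts a plain function to the Sink interface.
type SinkFunc func(items []Item) error

func (f SinkFunc) Push(items []Item) error { return f(items) }

// Middleware wraps a Sink, so metrics, logging, and transformers
// can be chained in front of any connector.
type Middleware func(Sink) Sink

// WithLogging is an example middleware: it logs batch sizes
// before delegating to the underlying sink.
func WithLogging(next Sink) Sink {
	return SinkFunc(func(items []Item) error {
		fmt.Printf("pushing %d items\n", len(items))
		return next.Push(items)
	})
}

// Chain applies middlewares right-to-left around a terminal sink,
// so the first middleware listed is the outermost wrapper.
func Chain(s Sink, mws ...Middleware) Sink {
	for i := len(mws) - 1; i >= 0; i-- {
		s = mws[i](s)
	}
	return s
}

func main() {
	var got []Item
	terminal := SinkFunc(func(items []Item) error {
		got = append(got, items...)
		return nil
	})
	sink := Chain(terminal, WithLogging)
	_ = sink.Push([]Item{{Table: "users", Value: "a"}})
	fmt.Println(len(got)) // → 1
}
```

Because plugins are compiled in as Go dependencies, a deployment simply imports the connectors and middlewares it needs; no dynamic loading is involved.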

*Dataplane overview*:

![Dataplane architecture](_assets/dp_architecture.png "Dataplane architecture")


We must handle each delivery as a separate entity or resource to be able to configure it in a centralized way. This realization led us to make the runtime engine pluggable so it can use any IaaS cloud provider or container management service (like k8s). Each runtime here is a simple stateless worker executor running a job binary with provided options.

The data plane can track the status of a job and process commands sent to it from our coordinator service.
13 changes: 6 additions & 7 deletions docs/benchmarks.md
@@ -13,7 +13,6 @@ This guide outlines the steps to benchmark database transfer services using a ro

2. **Prepare the Source Database**
- Set up a source database on ec2 instance.
- Use a pre-production serverless runtime environment.

3. **Load the Data**
Import the prepared dataset into the source database, ensuring it aligns with the benchmarking scenario.
@@ -35,7 +34,7 @@ Baselines provide reference points to measure performance.
- Document the metrics from this initial transfer.
- Key metric: Rows per second for single-core throughput.
- Example: Measure total transfer time or use metrics like rows/sec.
- ![bench_key_metrics.png](../assets/bench_key_metrics.png)
- ![bench_key_metrics.png](_assets/bench_key_metrics.png)

---

@@ -45,22 +44,22 @@ After setting baselines, fine-tune the transfer settings for better performance.

### Optimization Steps:
1. **Activate the Transfer**
Deploy the transfer via [helm](./deploy_k8s.md) in your k8s cluster.
Deploy the transfer via [helm](deploy_k8s.html) in your k8s cluster.

2. **Expose pprof for Profiling**
- Expose the pprof port for profiling, by default `--run-profiler` is true.

3. **Download the pprof File**
- CPU profiles are accessible at `http://localhost:{EXPOSED_PORT}/debug/pprof/`.
- ![bench_key_metrics.png](../assets/bench_pprof_lens.png)
- ![bench_key_metrics.png](_assets/bench_pprof_lens.png)
- Profiles typically sample for 30 seconds.
- ![bench_key_metrics.png](../assets/bench_pprof_prifle.png)
- ![bench_key_metrics.png](_assets/bench_pprof_prifle.png)

4. **Visualize the Profile**
- Use tools like [Speedscope](https://www.speedscope.app/).
- ![bench_key_metrics.png](../assets/bench_speedscope_init.png)
- ![bench_key_metrics.png](_assets/bench_speedscope_init.png)
- Upload the profile to analyze call stacks.
- ![bench_key_metrics.png](../assets/bench_results.png)
- ![bench_key_metrics.png](_assets/bench_results.png)
- Use the "Left-Heavy" view to identify high-time-consuming paths.

---
10 changes: 10 additions & 0 deletions docs/index.yaml
@@ -12,6 +12,16 @@ description:
- >-
It supports several data transfer scenarios, with every scenario run at the logical level. This allows you to keep your source database running and
minimize the downtime of applications that use the service.
- >-
At its heart, the application follows a <strong>plugin-based architecture</strong>, enabling extensibility and modularity.
<img style="width: 100%" src="_assets/architecture.png" alt="alt_text" title="image_tooltip">
<br/>
The system is built around a <strong>core module</strong> that acts as the central part of the application,
managing its internal logic and facilitating communication between components.
Users can interact with the system through either a <strong>Command-Line Interface (CLI)</strong> or via a
<strong>Component Development Kit (CDK)</strong>,
which serves as a library of interfaces for embedding functionality into external systems.
meta:
title: "{{product-name}}"
links:
11 changes: 11 additions & 0 deletions docs/roadmap/index.md
@@ -0,0 +1,11 @@
---
title: "Transfer roadmaps"
description: "Explore the active and past {{ data-transfer-name }} roadmaps."
---

# Transfer roadmaps

Transfer aims to have publicly visible roadmaps for future improvements; here is a list of active and past roadmaps.

* [{#T}](roadmap_2024.md)
* [{#T}](roadmap_2025.md)
56 changes: 56 additions & 0 deletions docs/roadmap/roadmap_2024.md
@@ -0,0 +1,56 @@
# Roadmap 2024

## Key Goals

1. **K8s Operator for Multi-Transfer Deployments**
2. **Delta Sink**
3. **Iceberg Sink**
4. **Clickhouse Exactly Once Support**

---

## 1. E2E Testing for Main Connectors

### Objective:
Set up comprehensive **end-to-end tests** in the CI pipeline for the following main connectors:
- **Postgres**
- **MySQL**
- **Clickhouse**
- **Yandex Database (YDB)**
- **YTsaurus (YT)**

### Steps:
- [x] Configure test environments in CI for each connector.
- [x] Design E2E test scenarios covering various transfer modes (snapshot, replication, etc.).
- [x] Automate test execution for all supported connectors.
- [x] Set up reporting and logs for test failures.

### Milestone:
Achieve **fully automated E2E testing** across all major connectors to ensure continuous integration stability.

---

## 2. Helm Deployment Documentation

### Objective:
Provide detailed documentation on deploying the transfer engine using **Helm** on Kubernetes clusters.

### Steps:
- [x] Create Helm chart for easy deployment of the transfer engine.
- [x] Write comprehensive **Helm deployment guide**.
- [x] Define key parameters for customization (replicas, resources, etc.).
- [x] Instructions for various environments (local, cloud).
- [x] Test Helm deployment process on common platforms (GKE, EKS, etc.).

### Milestone:
Enable seamless deployment of the transfer engine via Helm with clear and accessible documentation.

---

## Summary

- **Q2-Q3**: Focus on **E2E testing** for core connectors.
- **Q3**: Publish **Helm deployment** documentation and final testing.
- **Q3-Q4**: Develop and release the **Kubernetes operator** for multi-transfer management.

This roadmap aims to enhance testing, simplify deployment, and provide advanced scalability options for the transfer engine.
30 changes: 30 additions & 0 deletions docs/roadmap/roadmap_2025.md
@@ -0,0 +1,30 @@
# Roadmap 2025

## Key Goals

1. **K8s Operator for Multi-Transfer Deployments**
2. **Delta Sink**
3. **Iceberg Sink**
4. **Clickhouse Exactly Once Support**

---

## 1. Kubernetes Operator for Multi-Transfer Deployments

### Objective:
Develop a **Kubernetes operator** to manage multiple data transfers, simplifying the process for large-scale environments.

### Steps:
- [ ] Define CRD (Custom Resource Definitions) for transfer configurations.
- [ ] Implement operator logic for scaling and managing multi-transfer deployments.
- [ ] Add support for monitoring, scaling, and error recovery.
- [ ] Write user documentation for deploying and managing transfers via the operator.

### Milestone:
Provide a scalable solution for managing multiple data transfers in Kubernetes environments with an operator.

---

## Summary

TODO
10 changes: 10 additions & 0 deletions docs/toc.yaml
@@ -94,7 +94,17 @@ items:
- name: Connect Prometheus to Transfer
href: integrations/connect-prometheus-to-transfer.md

- name: Plans
items:
- name: Overview
href: roadmap/index.md
- name: "Roadmap 2024"
href: roadmap/roadmap_2024.md
- name: "Roadmap 2025"
href: roadmap/roadmap_2025.md
- name: Resolve issues with Transfer
href: transfer-self-help.md
- name: Questions and answers
href: transfer-faq.md
- name: Benchmarking
href: benchmarks.md
71 changes: 0 additions & 71 deletions roadmap/roadmap_2024.md

This file was deleted.
