This program selects the ideal functions that best fit the training data by least-squares deviation, then maps test data points to the closest of those functions. It involves database operations, data processing, visualization, and unit testing.
The project uses SQLite for database management, SQLAlchemy for ORM, pandas for data manipulation, and matplotlib for visualization.
- Database Setup:
- Automatically creates and populates tables for training, ideal, and test datasets.
- Data Processing:
- Identifies best-fit ideal functions using least squares method.
- Maps each test data point to a selected ideal function when its deviation stays within the threshold (√2 * max_deviation).
- Visualization:
- Generates plots comparing training data with ideal functions.
- Visualizes test data mapping, deviations, and residual errors.
- Saves plots as PNG files in the Output folder.
- Unit Tests:
- Validates database operations and mathematical calculations.
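The least-squares selection described under Data Processing can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the toy data, and the assumption that training and ideal series share the same x values are all hypothetical, and it uses plain Python rather than pandas.

```python
from typing import Dict, List


def sum_squared_deviation(y_train: List[float], y_ideal: List[float]) -> float:
    """Sum of squared differences between two series sampled at the same x values."""
    return sum((t - i) ** 2 for t, i in zip(y_train, y_ideal))


def best_ideal_function(y_train: List[float], ideals: Dict[str, List[float]]) -> str:
    """Return the name of the ideal function with the smallest squared deviation."""
    return min(ideals, key=lambda name: sum_squared_deviation(y_train, ideals[name]))


# Hypothetical toy data: one training series and three candidate ideal functions.
x = [0.0, 1.0, 2.0, 3.0]
y_train = [0.1, 0.9, 4.2, 8.8]
ideals = {
    "linear": [v for v in x],
    "quadratic": [v ** 2 for v in x],
    "cubic": [v ** 3 for v in x],
}

best = best_ideal_function(y_train, ideals)
print(best)  # prints "quadratic"
```

Repeating this selection for each training series would yield one ideal function per series.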
- Clone the repository:
git clone https://github.com/AmirKrichen/FunctionFitting
- Change directory:
cd FunctionFitting
- Install dependencies:
py -m pip install -r requirements.txt
.
├── data/ # Input CSV files
│ ├── ideal.csv
│ ├── test.csv
│ └── train.csv
│
├── database/
│ ├── models.py # Database ORM models
│ └── database_setup.py # Data insertion logic
│
├── ops_viz/
│ ├── data_processing.py # Analysis algorithms
│ └── visualizations.py # Plot generation
│
├── tests/
│ ├── test_database_setup.py # Database insertion unit tests
│ └── test_data_processing.py # Algorithm validation tests
│
├── Output/ # Generated PNG visualizations
│
├── main.py # main script
├── database.db # Generated database
└── requirements.txt
- Initial Step:
- Replace the CSV files in the data folder.
- Testing:
- Run the unit tests to verify functionality:
py -m unittest discover
- Run the main script:
python main.py
- This will:
- Insert data from CSV files into the database.
- Process the data to map test points to ideal functions.
- Generate and save visualizations in the Output folder.
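The first step above, inserting CSV data into a freshly reset database, might look roughly like the sketch below. It deliberately uses the standard-library sqlite3 and csv modules instead of the project's SQLAlchemy and pandas stack so that it runs without dependencies. The helper name, the table name, and the column names are assumptions made for illustration.

```python
import csv
import io
import sqlite3


def load_csv_into_table(conn, table, csv_file):
    """Recreate `table` from a CSV stream whose first row holds column names."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f'"{name}" REAL' for name in header)
    placeholders = ", ".join("?" for _ in header)
    # Dropping the table first mimics a reset on every run.
    conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    rows = [[float(value) for value in row] for row in reader if row]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


# Hypothetical in-memory CSV standing in for a file such as data/train.csv.
sample_csv = io.StringIO("x,y1,y2\n0.0,1.0,2.0\n1.0,1.5,2.5\n")
conn = sqlite3.connect(":memory:")
inserted = load_csv_into_table(conn, "training", sample_csv)
print(inserted)  # prints 2
```

Because the table is dropped before it is recreated, calling the helper again with the same data leaves exactly one copy of the rows.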
- Ensure write permissions for the Output folder before running the main script.
- The database (database.db) is automatically reset when running main.py.
- Deviation thresholds are calculated as ideal_max_dev * sqrt(2).
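The threshold rule can be illustrated with a short sketch. It assumes that ideal_max_dev is the largest absolute deviation between a training series and its selected ideal function, that a point exactly on the threshold still counts as mapped, and that the closest qualifying function wins. The function name and sample values are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple


def map_test_point(
    y_test: float,
    ideal_values: Dict[str, float],
    max_train_dev: Dict[str, float],
) -> Optional[Tuple[str, float]]:
    """Assign a test point to the closest ideal function within its threshold.

    ideal_values: y value of each selected ideal function at the test point's x.
    max_train_dev: assumed largest absolute training deviation per function.
    Returns (function name, deviation), or None when no function qualifies.
    """
    best = None
    for name, y_ideal in ideal_values.items():
        deviation = abs(y_test - y_ideal)
        threshold = max_train_dev[name] * math.sqrt(2)
        # Assumption: inclusive comparison; the closest qualifying function wins.
        if deviation <= threshold and (best is None or deviation < best[1]):
            best = (name, deviation)
    return best


# Hypothetical values: two selected functions evaluated at one x position.
selected_at_x = {"y1": 2.0, "y2": 5.0}
max_dev = {"y1": 0.5, "y2": 0.5}

mapped = map_test_point(2.6, selected_at_x, max_dev)    # within 0.5 * sqrt(2) of y1
unmapped = map_test_point(3.5, selected_at_x, max_dev)  # too far from both
print(mapped, unmapped)
```

With a maximum training deviation of 0.5 the threshold is about 0.707, so the first point is assigned to y1 and the second is left unmapped.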
- Developed by: Amir Krichen
- Course: DLMDSPWP01 – Programming with Python