Skip to content

armanawn/DSCI-513_databases-data-retrieval

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DSCI 513: Databases and Data Retrieval

How to retrieve data stored relational and non-relational database systems efficiently and effectively, and overview of how to create and modify database objects.

License

© 2022 Arman Seyed-Ahmadi

Software licensed under the MIT License, non-software content licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License. See the license file for more information.

Lectures

Find Panopto lecture recordings here.

# Topic
1 Introduction to databases, the relational model, SQL, basic queries
2 Data types, filtering, derived columns and aliases, functions
3 Aggregations, grouping, joins
4 Creating, modifying, and dropping tables and rows, integrity constraints
5 Transactions, ACID, subqueries, set operations
6 Views, CTEs, window functions, indexing, query optimization
7 Semi-structured data, non-relational databases, basic queries in MongoDB
8 Intro to big data, advanced queries in MongoDB, aggregation pipelines

Lab assignments

There will be one lab assignment per week. We will follow the standard MDS lab deadlines.

Quizzes

Quizzes will be open book, meaning you may consult course materials, online sources, etc. However, communication with other people during the quiz is strictly forbidden. See the MDS quiz instructions here. For the dates/times of the quizzes, see the MDS calendar.

Installation

  • The first step to setup the course's Conda environment is to run conda install -c conda-forge nb_conda_kernels in your base environment, but if you have done so already for another MDS course, skip this step.

  • The Conda environment file for the course is here. To create the environment, run conda env create -f dsci513env.yml (you only need to do this once). Finally, select the dsci513 kernel from within VS Code or JupyterLab.

Course learning objectives

  • Explain and justify the need for storing data in a database
  • Describe tables, tuples, and attributes in the relational model
  • Construct basic and advanced SQL statements to query relational databases
  • Define and manipulate tables and tuples using data definition language
  • Understand the usage of integrity constraints in relational databases
  • Describe the concept of transactions and concurrency control
  • Construct basic and advanced queries in a NoSQL DBMS

Suggested material

Recommended:

More advanced: