CSC3170 Course Project

Project Overall Description

This is our implementation for the course project of CSC3170, 2022 Fall, CUHK(SZ). For details of the project, you can refer to project-description.md. In this project, we apply what we learned in the lectures and tutorials of the course to implement one of the following major jobs:

Application with Database System(s)
Implementation of a Database System

Team Members

Our team consists of the following members, listed in the table below (the team leader is shown in the first row, and is marked with 🚩 behind his/her name):

Student ID	Student Name	GitHub Account (in Email)	GitHub Username
121090001	安子航 🚩	2284874018@qq.com	@i-cookie
121090184	侯天赐	enderturtle@foxmail.com	@EnderturtleOrz
121020163	沈驰皓	stevenshen3641@outlook.com	@StevenShen3641
121090519	涂喻钊	121090519@link.cuhk.edu.cn	@tyzzzzzzzzz
121090628	夏禹扬	2467925095@qq.com	@xqbf
121090841	郑莹琪	121090841@link.cuhk.edu.cn	@Aurora121090841

Project Specification

After thorough discussion, our team made its choice. The specification is listed below:

Our option choice is: Option 3

Project Abstract

This project implements a miniature relational database management system (DBMS) that stores data tables containing labeled information columns. The project consists of a language system and a version control system. In the language system, we defined a data definition language (DDL) and a data manipulation language (DML), and wrote the DDL interpreter and the DML compiler in Java to interpret users' input and operate on the data in tables. Version control is standard practice for maintaining a project and tracking it from inception to finalization. It is also a software engineering technique that keeps program files edited by different people synchronized during development, which plays an essential role in a multi-person cooperative project like this one. Because this project only deals with tiny databases, speed and efficiency are not primary concerns, although we still considered some efficiency improvements when designing the DBMS. Here is what we implemented in this system:

Basic coding:

Fill in the code templates provided by UCB.

Advanced coding:

Take data types (int/double/string) into account when creating tables and performing other operations;
Support the asterisk symbol '*';
Rename the columns;
Implement the commit and rollback operations;
Implement aggregate functions including max(), min(), avg(), sum(), round(), count();
Implement additional keywords including as, like, between, where (not) in, order by, group by, primary key;
Version Control: Use a snapshot strategy with SHA-1 as the version name and a trie as the version tree;
Application: re-implement Assignment 2.
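To illustrate the idea of using SHA-1 as a version name, here is a minimal sketch that hashes the text of a table snapshot into a 40-character hex code. The class name SnapshotHash, the method sha1Hex, and the choice to hash the snapshot as UTF-8 text are assumptions made for this example; the project's actual snapshot format may differ.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper (not from the project): names a snapshot by its SHA-1 digest. */
class SnapshotHash {

    /** Returns the lowercase hex SHA-1 digest of the snapshot text (always 40 characters). */
    static String sha1Hex(String snapshot) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(snapshot.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Treat a missing SHA-1 provider as a fatal configuration error.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed snapshot layout for illustration only: header line, then rows.
        String snapshot = "SID,Name\n101,Alice\n";
        System.out.println(sha1Hex(snapshot));
    }
}
```

Because the name is derived from the content, two identical snapshots receive the same version name, which is one reason content hashes are a common choice for snapshot-based version control.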

Set Up Instruction

Prerequisites

JDK >= 17
Make >= 4.2.1

Compile the Project

$ git clone https://github.com/CSC3170-2022Fall/project-database-messing-system.git
$ cd project-database-messing-system
$ make default

Run the Project

$ java db61b.Main

Custom Tests

This project is configured with test cases from CSC3170-2022Fall Assignment2. A modified version of the assignment description is provided here.

The solution should be stored as Assignment2/solutions/x.sql and the answer should be stored as Assignment2/answers/x.db, where x is the number of the test case.

A shell script tester.sh is used to judge out.db (except for test case 3, which needs order by) against the standard answers. In other words, a statement like store <table> out is always required in your solution file.

tester.sh first sorts out.db into out_sorted.db, then compares out_sorted.db with the standard answer.

In total, tester.sh reports one of three states: Passed, Failed and Skipped.

Passed: Your output, after sorting, agrees with the answer. (Note that for test case 3, no sorting will be done)
Failed: Your output, after sorting, does not match the answer.
Skipped: Cannot find the solution file of this test case.

For Failed test points, tester.sh will provide output comparison reports and run logs.
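The judging rule described above can be pictured with the following sketch. It is a Java rendering of the idea, not the tester.sh script itself. The names OutputJudge, State, and judge are hypothetical, and the sketch assumes that every line of the output is sorted as plain text and that the stored answer is already in sorted order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of the Passed / Failed / Skipped decision. */
class OutputJudge {
    enum State { PASSED, FAILED, SKIPPED }

    /**
     * @param output    lines of out.db, or null when no solution file exists
     * @param answer    lines of the standard answer (assumed already sorted)
     * @param sortFirst false for order-sensitive cases such as test case 3
     */
    static State judge(List<String> output, List<String> answer, boolean sortFirst) {
        if (output == null) {
            return State.SKIPPED;
        }
        List<String> actual = new ArrayList<>(output); // leave the caller's list untouched
        if (sortFirst) {
            Collections.sort(actual);
        }
        return actual.equals(answer) ? State.PASSED : State.FAILED;
    }

    public static void main(String[] args) {
        List<String> out = List.of("102,Bob", "101,Alice");
        List<String> ans = List.of("101,Alice", "102,Bob");
        System.out.println(judge(out, ans, true));
    }
}
```

Sorting before comparing is what makes the check insensitive to row order, which matters here because rows are kept in unordered sets.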

If you need to configure more test cases, just change the loop termination condition in tester.sh.

Run Custom Tests

$ bash Assignment2/tester.sh

GitHub Action Configuration

The CI configuration first sets up the basic environment (Ubuntu or another OS with bash, Make, and JDK 17), then runs the command bash Assignment2/tester.sh.

The following is the CI configuration of this repository.

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
    - name: Set up JDK 17
      uses: actions/setup-java@v3
      with:
        java-version: '17'
        distribution: 'temurin'
    - name: Compile the project
      run: make default
    - name: Run the assignment 2 test cases
      run: bash Assignment2/tester.sh

Database Structure

All the data are stored in the rows of each table. Rows are stored in HashSets within tables, and tables are stored in HashMaps within databases. Each table records the name and data type of each column, and its rows can be traversed using an iterator.

Since the HashSets are unordered, the clause order by 'xxx' has no effect when the column 'xxx' is not in the result table.
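The layout described in this section can be sketched roughly as follows. The class names SketchTable and SketchDatabase, and the choice to represent a row as a list of strings, are assumptions for illustration and are not the project's actual classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical table: column names and types plus an unordered set of rows. */
class SketchTable implements Iterable<List<String>> {
    final List<String> columnNames;
    final List<String> columnTypes; // e.g. "int", "double", "string"
    private final Set<List<String>> rows = new HashSet<>();

    SketchTable(List<String> columnNames, List<String> columnTypes) {
        if (columnNames.size() != columnTypes.size()) {
            throw new IllegalArgumentException("each column needs exactly one type");
        }
        this.columnNames = new ArrayList<>(columnNames);
        this.columnTypes = new ArrayList<>(columnTypes);
    }

    /** Adds a row; returns false if an identical row is already stored. */
    boolean add(List<String> row) {
        if (row.size() != columnNames.size()) {
            throw new IllegalArgumentException("row width does not match the table");
        }
        return rows.add(Collections.unmodifiableList(new ArrayList<>(row)));
    }

    int size() {
        return rows.size();
    }

    /** Iteration order is unspecified because the rows live in a HashSet. */
    @Override
    public Iterator<List<String>> iterator() {
        return rows.iterator();
    }
}

/** Hypothetical database: a map from table name to table. */
class SketchDatabase {
    private final Map<String, SketchTable> tables = new HashMap<>();

    void put(String name, SketchTable table) {
        tables.put(name, table);
    }

    SketchTable get(String name) {
        return tables.get(name); // null when no such table exists
    }
}
```

A set-based row store removes duplicate rows automatically, at the cost of losing any insertion order, which is why ordering has to be applied to the result explicitly.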

Basic Syntax

Note: the students.db file used in the examples is stored under the "readme-related files" folder.

create statement ::= create table <name> <table definition>
table definition ::= (<column name> <column data type>⁺,); | as <select clause>;
insert statement ::= insert into <table name> values (<literal>⁺,)⁺,;
print statement ::= print <table name>;
load statement ::= load <name>;
store statement ::= store <file name without extension> <table name>;
exit statement ::= quit; | exit;
select statement ::= <select clause>;
select clause ::= select <column name>⁺, from <table name> <condition clause>;
Operators supported in the condition clause: =, <, <=, >, >=, !=

Advanced Syntax

Primary key ::= primary key <column name>; (It is used together with the table definition.)
Asterisk symbol ::= select * from <table name>;
Rename columns ::= select <column name>⁺, '<another name>'⁺, from <table name>;
Aggregate functions (avg, max, min, count, sum) ::= select <function> <column name>⁺, from <table name>;
select with round function ::= select round <column name> <operator> <operand> reserve <number of reserved digits> from <table name>;
select with in condition ::= select <column name>⁺, from <table name> where <column name> in <select clause>;
select with order by ::= select <column name>⁺, from <table name> order by '<column name>'⁺,<order>;
select with group by ::= select <column name>⁺, function <column name> from <table name> group by <column name>⁺,;
select with between condition ::= select <column name>⁺, from <table name> where <column name> between <lower bound> and <upper bound>;
select with like condition ::= select <column name>⁺, from <table name> where <column name> like <pattern>; (supported operators: '_' and '%')
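One common way to evaluate a like pattern is to translate it into a regular expression, where '%' matches any run of characters and '_' matches exactly one character. The sketch below shows that approach under the assumption that no escape character is supported; the class name LikeMatcher is hypothetical, and the project may implement like differently.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: matches a value against a like pattern using '%' and '_'. */
class LikeMatcher {

    /** Translates a like pattern into an equivalent regular expression. */
    static Pattern toRegex(String like) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%' || c == '_') {
                // Flush the pending literal text, quoted so regex metacharacters stay literal.
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '%' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        // DOTALL lets the wildcards also match line terminators.
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** Returns true when the whole value matches the like pattern. */
    static boolean matches(String value, String like) {
        return toRegex(like).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("Alice", "A%"));   // true
        System.out.println(matches("Alice", "_lice")); // true
        System.out.println(matches("Alice", "A_"));    // false
    }
}
```

Quoting the literal parts matters: without it, a pattern such as a.c would treat the dot as a regex wildcard and wrongly match abc.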

Notes:

Aggregate functions can be used together with "where" conditions only when there is a "group by" clause. In that case, only the last argument can be an aggregate function.
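As a rough picture of how a group by query with a trailing aggregate function can be evaluated, the sketch below groups rows by one column and averages another. The class name GroupBySketch, the method avgByGroup, and the representation of rows as string arrays are assumptions for this example only, not the project's implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical illustration of "select <key>, avg <value> ... group by <key>". */
class GroupBySketch {

    /** Groups rows by the key column and averages the value column within each group. */
    static Map<String, Double> avgByGroup(List<String[]> rows, int keyCol, int valueCol) {
        return rows.stream().collect(Collectors.groupingBy(
                r -> r[keyCol],
                TreeMap::new, // sorted keys, only to make the printed result stable
                Collectors.averagingDouble(r -> Double.parseDouble(r[valueCol]))));
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"CS", "80"},
                new String[] {"CS", "90"},
                new String[] {"EE", "70"});
        System.out.println(avgByGroup(rows, 0, 1)); // {CS=85.0, EE=70.0}
    }
}
```

Each group produces exactly one output row, which is why the aggregate is computed per group rather than over the whole table.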

"in" and "not in" can only be applied to the select clause with one table.

"in" and "not in" can be used along with other conditions, but it must be the last condition.

Version Control Syntax

commit statement ::= commit <table name>;
rollback to statement ::= rollback <table name> to <version code>;
rollback at statement ::= rollback <table name> at <version code>;
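Since the project stores versions in a trie keyed by SHA-1 codes, a minimal sketch of such a structure is given below. The class name VersionTrie and its methods are hypothetical. Resolving a version from a unique prefix of its code is an assumption added for illustration; the source does not state that abbreviated version codes are accepted.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical trie mapping version codes (for example SHA-1 hex strings) to snapshots. */
class VersionTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        String snapshot; // non-null only where a stored version code ends
    }

    private final Node root = new Node();

    /** Stores a snapshot under the given version code. */
    void put(String code, String snapshot) {
        Node node = root;
        for (char c : code.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.snapshot = snapshot;
    }

    /** Follows the characters of key from the root; returns null if the path does not exist. */
    private Node walk(String key) {
        Node node = root;
        for (char c : key.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return null;
            }
        }
        return node;
    }

    /** Exact lookup; returns null when the version code is unknown. */
    String get(String code) {
        Node node = walk(code);
        return node == null ? null : node.snapshot;
    }

    /** Returns the snapshot whose code starts with prefix, or null if none or several match. */
    String resolvePrefix(String prefix) {
        Node node = walk(prefix);
        if (node == null) {
            return null;
        }
        while (node.snapshot == null) {
            if (node.children.size() != 1) {
                return null; // no continuation, or more than one candidate version
            }
            node = node.children.values().iterator().next();
        }
        return node.children.isEmpty() ? node.snapshot : null;
    }
}
```

A lookup visits one node per character of the code, so its cost depends on the code length (40 characters for SHA-1 hex) rather than on how many versions are stored.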

Standard Error Messages

Error Message	Explanation
Syntax Error	Unrecognizable command keywords; too many or too few arguments; division by 0; unterminated literal or comment; wrong usage of commands
Format Error	Wrong data type when inserting or comparing; function applied to an unsupported data type; wrong data type of arguments for some commands; input not in correct UTF-8 encoding; invalid SHA-1 code
Value Mismatch	Specified column, table, version, or type cannot be found; index out of range of a container; lists that should have identical lengths do not; duplicate names
FileFormatError	Unexpected end of input; no header or data type in the .db file; number of columns in a row does not equal that of the table
FileNotFound	Specified file cannot be found
VersionNotFound	Specified version cannot be found; more than one table shares the same name

Re-implement Assignment2

For the specific results, please refer to the PDF file "Presentation slides.pdf". For the code and .db files used in the presentation, see the "presentation-related files" directory.

Hyperlinks

We have posted the presentation video on bilibili:

2022FALL CSC3170 Group2 Database-Messing-System Final Presentation.

Presentation slides: Presentation slides.pdf.

Besides this README.md, we also maintain TO-DO.md, which roughly tracks what we have done.
