Skip to content

CSC3170-2022Fall/project-database-messing-system

Repository files navigation

Open in Visual Studio Code

CSC3170 Course Project

Project Overall Description

This is our implementation for the course project of CSC3170, 2022 Fall, CUHK(SZ). For details of the project, you can refer to project-description.md. In this project, we will utilize what we learned in the lectures and tutorials in the course, and implement either one of the following major job:

  • Application with Database System(s)
  • Implementation of a Database System

Team Members

Our team consists of the following members, listed in the table below (the team leader is shown in the first row, and is marked with 🚩 behind his/her name):

Student ID Student Name GitHub Account (in Email) GitHub Username
121090001 安子航 🚩 2284874018@qq.com @i-cookie
121090184 侯天赐 enderturtle@foxmail.com @EnderturtleOrz
121020163 沈驰皓 stevenshen3641@outlook.com @StevenShen3641
121090519 涂喻钊 121090519@link.cuhk.edu.cn @tyzzzzzzzzz
121090628 夏禹扬 2467925095@qq.com @xqbf
121090841 郑莹琪 121090841@link.cuhk.edu.cn @Aurora121090841

Project Specification

After thorough discussion, our team made the choice and the specification information is listed below:

  • Our option choice is: Option 3

Project Abstract

This project writes a miniature relational database management system (DBMS) that stores data tables containing labeled information columns. The project consists of the language system and the version control system. In language system, we defined the data definition language (DDL) and data manipulation language (DML) and wrote the DDL interpreter and DML in the java language compiler to interpreting users' input and dealing with data in tables. The version control system is standard practice for maintaining a project and tracking it from inception to finalization. In addition, version control is a software engineering technique to ensure that the same program files edited by different people are synchronized during the software development process, which play an essential role in a such multi-person cooperative project. We will only deal with tiny databases for this project, so we will not consider too much about speed and efficiency. But we will still consider part of the efficiency improvement when designing the DBMS. Here's what we implemented in this system:

Basic coding:

  • Filling the code templates provided by UCB.

Advance coding:

  • Take data type (int/double/string) into consideration while creating the table and doing other operations;
  • Asterisk symbol '*';
  • Rename the columns;
  • Implement the operations including commit, rollback;
  • Implement aggregate functions including max(), min(), avg(), sum(), round(), count();
  • Implement additional keyword including as, like, between, where (not) in, order by, group by, primary key;
  • Version Control: Use snapshot strategy with SHA-1 as version name and trie as version tree;
  • Application: re-implement Assignment 2.

Set Up Instruction

Prerequisites

  • jdk >= 17

  • Make >= 4.2.1

Compile the Project

$ git clone https://github.com/CSC3170-2022Fall/project-database-messing-system.git
$ cd project-database-messing-system
$ make default

Run the Project

$ java db61b.Main

Custom Tests

This project is configured with test cases from CSC3170-2022Fall Assignment2. A modified version of assignment description is provided here.

The solution should be stored as Assignment2/solutions/x.sql and the answer should be stored as Assignment2/answers/x.db, where x is the number of the test cases.

A shell script tester.sh is used to judge the out.db (except for the test case 3 which needs order by) with standard answers. In other word, a sentence like store <table> out is always required in your solution file.

tester.sh will sort out.db and turn it into out_sorted.db first, and compare out_sorted.db with the standard answer.

In total, tester.sh returns three states Passed, Failed and Skipped.

  • Passed: Your output, after sorting, agrees with the answer. (Note that for test case 3, no sorting will be done)

  • Failed: Your output is not consistent with answers after sorting.

  • Skipped: Cannot find the solution file of this test case.

For Failed test points, tester.sh will provide output comparison reports and run logs.

If you need to configure more test cases, just change the loop termination condition in tester.sh.

Run Custom Tests

$ bash Assignment2/tester.sh

GitHub Action Configuration

CI configuration starts with the basic environment (ubuntu or other OS with bash, Make, JDK 17), then run the command bash Assignment2/tester.sh.

The following is the CI configuration of this repository.

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
    - name: Set up JDK 17
      uses: actions/setup-java@v3
      with:
        java-version: '17'
        distribution: 'temurin'
    - name: Compile the project
      run: make default
    - name: Run the assignment 2 test cases
      run: bash Assignment2/tester.sh

Database Structure

All the data are stored in the rows of each table. Rows are stored based on HashSets in tables, and tables are stored based on HashMaps in databases. For each table, it contains information about the name and data type of each column, and the rows can be traversed using an iterator.

Since the HashSets are unordered, the clause order by 'xxx' has no effect when the column 'xxx' is not in the result table.

Basic Syntax

*Note that the students.db file used in examples is stored under the "readme-related files" folder.

  • create statement ::= create table <name> <table definition>

  • table definition ::= (<column name> <column data type>+,); | as <select clause>;

    Example:

  • insert statement::= insert into <table name> values (<literal>+,)+,;

    Example:

  • print statement ::= print <table name>;

    Example:

  • load statement ::= load <name>;

  • store statement ::= store <file name without extension> <table name> ;

  • exit statement ::= quit; | exit ;

  • select statement ::= <select clause>;

  • select clause ::= select <column name>+, from <table name> <condition clause>;

  • Operator in select clause: =, <, <=, >, >=, !=

    Example:

Advanced Syntax

  • Primary key ::= primary key <column name>; (It is used together with the table definition.)

    Example:

  • Asterisk symbol ::= select * from <table name>;

  • Rename columns ::= select <column name>+, '<another name>'+, from <table name>;

    Example:

  • Aggregated functions (avg, max, min, count, sum) ::= select <function> <column name>+, from <table name>;

    Example:

  • select with round function ::= select round <column name> <operator> <operand> reserve <number of reserved bits> from <table name>;

    Example:

  • select with in condition ::= select <column name>+, from <table name> where <column name> in <select clause>;

    Example:

  • select with order by ::= select <column name>+, from <table name> order by '<column name>'+,<order>;

    Example:

  • select with group by ::= select <column name>+, function <column name> from <table name> group by <column name>+,;

    Example:

  • select with between condition ::= select <column name>+, from <table name> where <column name> between <lower bound> and <upper bound>;

    Example:

  • select with like condition ::= select <column name>+, from <table name> where <column name> like <pattern>; (supported operator: '_' and '%')

    Example:

    Notes:

    Aggregate functions can be used with "where" conditions only when there is "group by" clause, in that case, only the last argument can be an aggregate function.

    "in" and "not in" can only be applied to the select clause with one table.

    "in" and "not in" can be used along with other conditions, but it must be the last condition.

Version Control Syntax

  • commit statement ::= commit <table name>;

    Example:

  • rollback to statement ::= rollback <table name> to <version code>;

    Example:

  • rollback at statement ::= rollback <table name> at <version code>;

    Example:

Standard Error Messages

Error Message Explanation
Syntax Error
  • Unrecognizable command keywords
  • Too many or too few arguments
  • Divide something by 0
  • Unterminated literal or comment
  • Wrong usage of commands
  • Format Error
  • Wrong data type when inserting or comparing
  • Apply functions to unsupported type of data
  • Wrong data type of arguments for some commands
  • Not using correct utf-8 encoding
  • Invalid SHA-1 code
  • Value Mismatch
  • Cannot find specified column or table or version or type
  • Index out of range of a container
  • Some lists should have identical length but actually don't.
  • Duplicate names
  • FileFormatError
  • Unexpected end of input
  • No header or datatype in the .db file
  • Number of columns in a row does not equal that of the table.
  • FileNotFound
  • Cannot find specified file
  • VersionNotFound
  • Cannot find specified version
  • More than one table share the same name
  • Re-implement Assignment2

    For the specific results, please refer to the pdf file “Presentation slides.pdf". For code and .db file used in presentation, you can check the "presentation-related files" directory.

    Hyperlinks

    We have posted the presentation video on bilibili:

    2022FALL CSC3170 Group2 Database-Messing-System Final Presentation.

    Presentation slides: Presentation slides.pdf.

    Besides this README.md, we have also set TODO.md to roughly show the things we have done.

    About

    project-database-messing-system created by GitHub Classroom

    Resources

    License

    Stars

    Watchers

    Forks

    Releases

    No releases published

    Packages

    No packages published

    Contributors 7

    Languages