Add dataset parse/load library by wzhao18 · Pull Request #13 · ampersand-projects/streambench

wzhao18 · 2022-02-20T01:25:18Z

No description provided.

anandj91

struct Row {
    string dataset;
};

struct CSVRow : public Row {
    vector<string> cols;
};

template<typename R>
class DataParser {
public:
    // Virtual destructor so derived parsers can be deleted through a base pointer.
    virtual ~DataParser() = default;

protected:
    virtual void run() = 0;
    virtual void parse(fstream&) = 0;
    virtual void decode(R&, stream::stream_event&) = 0;

    // Writes one length-delimited protobuf message to stdout.
    bool write_serialized_to_ostream(stream::stream_event &t)
    {
        if (!google::protobuf::util::SerializeDelimitedToOstream(t, &cout)) {
            cerr << "Failed to serialize data into output stream" << endl;
            return false;
        }
        return true;
    }
};

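class CSVParser : public DataParser<CSVRow> {
protected:
    // Overrides DataParser::parse; the name must match the base declaration.
    void parse(fstream &file) override
    {
        string line;
        getline(file, line);  // discard the first line (assumed to be the CSV header)

        while (getline(file, line)) {
            CSVRow row;
            string word;
            stringstream ss(line);

            // Split the line on commas (no handling of quoted fields).
            while (getline(ss, word, ',')) {
                row.cols.push_back(word);
            }

            stream::stream_event data;
            decode(row, data);
            if (!write_serialized_to_ostream(data)) {
                break;
            }
        }
    }
};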

Consider organizing the base classes along the lines of the code above.

Additionally, try to follow a standard style when you write code, for example a consistent position for the braces of classes and functions.
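To make the suggested split concrete, here is a minimal sketch of how a dataset-specific parser might plug into a shared CSV base class. It is not the code from this PR: `Event` is a plain stand-in for the protobuf `stream::stream_event`, `SampleParser` and its three-column layout are hypothetical, and output is written as plain text rather than with `SerializeDelimitedToOstream` so that the sketch has no protobuf dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for stream::stream_event; the fields are assumptions.
struct Event {
    std::int64_t st = 0;
    std::int64_t et = 0;
    double payload = 0.0;
};

struct CSVRow {
    std::vector<std::string> cols;
};

// The base class owns the read loop and defers row decoding to subclasses.
class CSVParser {
public:
    explicit CSVParser(std::ostream &out) : out_(out) {}
    virtual ~CSVParser() = default;

    // Parses every row after the header and returns the number of events written.
    std::size_t parse(std::istream &in)
    {
        std::string line;
        std::getline(in, line);  // skip the header row

        std::size_t count = 0;
        while (std::getline(in, line)) {
            CSVRow row;
            std::string word;
            std::stringstream ss(line);
            while (std::getline(ss, word, ',')) {
                row.cols.push_back(word);
            }

            Event e;
            decode(row, e);
            // Plain-text output in place of protobuf delimited serialization.
            out_ << e.st << ' ' << e.et << ' ' << e.payload << '\n';
            ++count;
        }
        return count;
    }

protected:
    virtual void decode(const CSVRow &row, Event &e) = 0;

private:
    std::ostream &out_;
};

// Hypothetical dataset whose columns are: start time, end time, value.
class SampleParser : public CSVParser {
public:
    using CSVParser::CSVParser;

protected:
    void decode(const CSVRow &row, Event &e) override
    {
        assert(row.cols.size() >= 3);
        e.st = std::stoll(row.cols[0]);
        e.et = std::stoll(row.cols[1]);
        e.payload = std::stod(row.cols[2]);
    }
};
```

With this shape, adding a dataset only requires a new `decode` override. The line splitting and the output path stay in one place.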

dataset_util/include/data_parser.h

anandj91 · 2022-03-28T15:55:03Z

dataset_util/include/taxi_data_parser.h

Consider using two different parsers for fare and trip datasets. You don't need to combine them in one.

The parsers for the fare and trip datasets are combined so that both can be streamed to stdout at the same time. The taxi benchmark requires two streams.
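One way the two positions could be reconciled is to keep the parsers separate and let a small driver interleave their output on a single stream. The sketch below only illustrates that idea and is not taken from this PR: `RecordParser`, `TripParser`, `FareParser`, `interleave`, the `trip`/`fare` tags, and the choice of columns are all hypothetical, and tagged text lines stand in for the serialized protobuf messages.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One parser per dataset; each record is tagged so a consumer can demultiplex.
class RecordParser {
public:
    RecordParser(std::string tag, std::istream &in)
        : tag_(std::move(tag)), in_(in) {}
    virtual ~RecordParser() = default;

    // Reads one input line and writes one tagged record; false at end of input.
    bool emit_next(std::ostream &out)
    {
        std::string line;
        if (!std::getline(in_, line)) {
            return false;
        }
        out << tag_ << '|' << format(line) << '\n';
        return true;
    }

protected:
    virtual std::string format(const std::string &line) = 0;

private:
    std::string tag_;
    std::istream &in_;
};

// Hypothetical trip decoder: keeps the first column only.
class TripParser : public RecordParser {
public:
    explicit TripParser(std::istream &in) : RecordParser("trip", in) {}

protected:
    std::string format(const std::string &line) override
    {
        return line.substr(0, line.find(','));
    }
};

// Hypothetical fare decoder: keeps the last column only.
class FareParser : public RecordParser {
public:
    explicit FareParser(std::istream &in) : RecordParser("fare", in) {}

protected:
    std::string format(const std::string &line) override
    {
        const std::size_t pos = line.rfind(',');
        return pos == std::string::npos ? line : line.substr(pos + 1);
    }
};

// Round-robins over the parsers until all inputs are exhausted.
std::size_t interleave(std::vector<RecordParser *> parsers, std::ostream &out)
{
    std::size_t written = 0;
    bool progressed = true;
    while (progressed) {
        progressed = false;
        for (RecordParser *p : parsers) {
            assert(p != nullptr);
            if (p->emit_next(out)) {
                ++written;
                progressed = true;
            }
        }
    }
    return written;
}
```

Round-robin ordering is only the simplest choice. A real driver for the taxi benchmark would more likely merge the two streams by event timestamp.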

WeiZhao added 5 commits February 19, 2022 20:23

Add dataset parse/load library and use it in tilt

7ed7b89

combine parser and data-gen

8b803ef

Parse taxi trip dataset as a folder

bb819ad

Add taxi fare dataset

e41008d

Add java decoder for taxi dataset

975ec9c

wzhao18 force-pushed the stream-dataset branch from 27aaaab to 975ec9c Compare February 23, 2022 00:21

WeiZhao added 7 commits February 22, 2022 19:52

Add documentation for dataset parser/loader

112a53e

Refactor dataset_util

c3341b1

Add protobuf submodule

72efa60

Add dataset loader in trill

1c19a5e

Add loader example for csharp

9ea6c9f

Add documentation for loading dataset with c++ and c#

2fc8253

load dataset by command line arguments

6f230d7

wzhao18 force-pushed the stream-dataset branch from b1959aa to 6f230d7 Compare February 24, 2022 03:03

WeiZhao added 10 commits February 28, 2022 01:10

Add parsing for vibration data

90f1e6a

Fix header file definitions

1b4e45e

Separate payload from message

c1161ab

Clean up Main for Trill benchmark

aff95c2

Parser can parse different datasets at each run

e3d905c

Remove loader folder

0b505c5

Fix CMAKE file to accept folder of protos

a7395f2

Add Vibration dataset parser for trill

bfc7232

Wrapper for stream event, allows to check the type of protobuf message

22c4b88

Remove debugging code

061d2af

anandj91 requested changes Mar 28, 2022

dataset_util/include/data_parser.h (outdated)

dataset_util/include/data_parser.h (outdated)

anandj91 reviewed Mar 28, 2022

wzhao18 requested a review from anandj91 May 4, 2022 03:33

Add partition key to stream_event

9537d61

wzhao18 force-pushed the stream-dataset branch from d1e1e7e to 9537d61 Compare May 4, 2022 03:39

wzhao18 wants to merge 23 commits into ampersand-projects:master from wzhao18:stream-dataset
