Free Datasets for Practice

Open Datasets

There are many open datasets that you can download to practice working with big data platforms. We suggest that you focus on a single domain and use its data for your work in the course. The following datasets can be used for both batch and streaming analytics, and for different tasks (ingestion, processing, etc.).

When using these datasets, you must comply with their respective licenses.

Datasets Open for this course only

These datasets are stored in this directory and are provided by Linh Truong for the course. Keep the license of the data in mind.

  • bts data: in bts, samples of sensor monitoring data from base transceiver stations
  • network operation monitoring: in onudata, a small dataset about network monitoring
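
As a rough illustration of how the same sample data could serve both batch and streaming practice, the sketch below reads a small CSV once as a whole (batch) and once record by record (stream). The inline sample rows and the column names `station_id`, `parameter`, and `value` are assumptions made for this example only; they are not the actual schema of the files in bts or onudata.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List

# Hypothetical sample rows. The column names are illustrative assumptions,
# not the real schema of the course datasets.
SAMPLE_CSV = """station_id,parameter,value
s1,temperature,21.5
s1,humidity,40.0
s2,temperature,19.0
"""


def read_batch(text: str) -> List[Dict[str, str]]:
    """Batch style: load every record into memory at once."""
    return list(csv.DictReader(io.StringIO(text)))


def read_stream(text: str) -> Iterator[Dict[str, str]]:
    """Streaming style: yield one record at a time, as a message source would."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row


def mean_value(rows: Iterable[Dict[str, str]]) -> float:
    """Compute the mean of the 'value' column over any iterable of records."""
    total = 0.0
    count = 0
    for row in rows:
        total += float(row["value"])
        count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    print(mean_value(read_batch(SAMPLE_CSV)))   # batch: all rows loaded first
    print(mean_value(read_stream(SAMPLE_CSV)))  # stream: rows consumed one by one
```

In a real exercise, the generator would typically be replaced by a producer that sends each record to a message broker, and the list by a bulk load into a storage or processing system.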

Your own datasets

You can also propose your own dataset for your assignment, but you must discuss it with the lecturer first.