Everything you need to know about UCL DSS workshops!
Welcome to the UCL Data Science Society ✨ ! This is a quick starter pack for a Science executive and anyone else who is involved in the planning, design, scheduling, implementation and maintenance of the workshops.
Big shout out to @oleksiysmola Alex for implementing more than ten amazing workshops two years ago as the Head of Science and continuing to contribute to the work at Science last year. Without his legacy workshop materials, we would never have been able to achieve what we have today. Thanks, Alex!
A further shout out to Tony for continuing this great work as Head of Science last year, updating existing material and developing several new workshops alongside his team of executives! Again, without his work we would not be able to continue to build on the success of this society!
This year, we have categorised our workshops into four main themes:
- Introduction to Python Programming 💻
- Toolkits for Data Scientists 🔬
- Data Science with Python 🔮
- Data Science Fields 🥼
Legend:
- 🟥: Not implemented
- 🟨: Legacy material available, to be updated or re-written
- 🟦: Available, modification required
- 🟩: Good to go
| Code | Topic | Prerequisite | Assigned To | Status |
|---|---|---|---|---|
| PY01 | Fundamentals | None | Philip | 🟩 |
| PY02 | Sequence: Lists and Tuples | PY01 | Philip | 🟩 |
| PY03 | Logic | None | Philip | 🟩 |
| PY04 | Functions | PY01 PY02 | TBC | 🟥 |
| PY05 | Object-Oriented Programming | PY01 PY02 PY03 PY04 | Philip | 🟩 |
| PY1x | Algorithms | PY01 PY02 PY03 PY04 | All | A series, TBC |
| PYTx | Troubleshooting Sessions: PATH, Jupyter Notebooks, Interpreters, pip and conda | None | All | A series, TBC |
| Code | Topic | Prerequisite | Assigned To | Status |
|---|---|---|---|---|
| TK01 | Numpy | Basic Python | Philip | 🟨 |
| TK02 | Pandas | Basic Python | TBC | 🟨 |
| TK03 | matplotlib | Basic Python | TBC | 🟨 |
| TK04 | Git and GitHub | None | Philip | 🟩 |
| TK05 | SQL | None | Philip | 🟩 |
| Code | Topic | Prerequisite | Assigned To | Status |
|---|---|---|---|---|
| DS01 | Linear Model Regression | TK01 TK02 TK03 | Zeyan | 🟨 |
| DS02 | Logistic regression | TK01 TK02 TK03 DS02 | Seda | 🟨 |
| DS03 | Ridge and Lasso Regression | TK01 TK02 TK03 DS01 DS02 | Philip | 🟩 |
| DS04 | Decision Trees | TK01 TK02 TK03 | Philip | 🟩 |
| DS05 | Random Forest | TK01 TK02 TK03 DS04 | Philip | 🟩 |
| DS06 | Support Vector Machine | TK01 TK02 TK03 DS04 DS05 | TBC | 🟨 |
| DS07 | K-means clustering | TK01 TK02 TK03 | Philip | 🟩 |
| DS08 | Hierarchical clustering | TK01 TK02 TK03 DS07 | Philip | 🟩 |
| DS09 | DBScan clustering | TK01 TK02 TK03 DS07 DS08 | TBC | 🟥 |
| DS10 | Dimensionality reduction | TK01 TK02 TK03 | Philip | 🟩 |
| DS11 | Introduction to Neural Network | Basic Python | TBC | 🟨 |
| Code | Topic | Prerequisite | Assigned To | Status |
|---|---|---|---|---|
| DSF01 | Data Science for Finance | PY01, PY02, PY03 | Zeyan | 🟥 |
| DSF02 | Spatial Data Science | PY01, PY03, PY03, DS03, DS02 | Philip | 🟥 |
| DS12 | CNNs | DS11 | Sebastian, Stefania | 🟨 |
| DS13 | k-NN | DS01 DS02 DS03 | Tania | 🟨 |
| DS14 | Word Embedding | DS11 | Sebastian, Stefania | 🟨 |
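The Prerequisite columns above effectively describe a dependency graph. As a minimal sketch, assuming the Python-theme rows are copied into a dictionary by hand (the `PREREQUISITES` name and the `suggested_order` helper are illustrative and not part of any DSS tooling), the standard library's `graphlib` module (Python 3.9+) can produce one valid order in which to schedule or attend the workshops:

```python
from graphlib import TopologicalSorter

# Prerequisites copied by hand from the PY rows of the table.
# Illustrative data only; it is not generated from the repository.
PREREQUISITES = {
    "PY01": [],
    "PY02": ["PY01"],
    "PY03": [],
    "PY04": ["PY01", "PY02"],
    "PY05": ["PY01", "PY02", "PY03", "PY04"],
}


def suggested_order(prerequisites):
    """Return the workshop codes so that each one comes after its prerequisites."""
    return list(TopologicalSorter(prerequisites).static_order())


order = suggested_order(PREREQUISITES)
print(order)
```

A circular chain of prerequisites raises `graphlib.CycleError`, so the same helper can double as a quick sanity check when new rows are added.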
In order to maintain the consistency and quality of our workshops, please follow these rules throughout your implementations.
A recommended format for Python programming related workshops is Jupyter Notebooks. Other formats include Markdown, PDF, PowerPoint and so on. The following are some guidelines to follow.
- Title: `<Code> - <Theme>:<Topic>`, e.g. `PY01 - Introduction to Python Programming:Fundamentals`
- Use a clear sub-title structure throughout the notebooks (`h1` to `h6` in Markdown syntax)
- Use Markdown syntax throughout the notebooks: e.g. *italic*, **bold**, ***bold and italic***, Object, `inlineCode()`, quoteblocks and codeblocks and so on. This document is an example.
- Keep it consistent. For instance, I prefer to use a quoteblock for definitions:

  > Computer Science: The study of computation and information. Computer science deals with theory of computation, algorithms, computational problems, and the design of computer systems hardware, software, and applications.

- Use LaTeX for mathematical formulae and expressions, both inline and equation blocks.
- Use relative paths instead of absolute paths for inserting pictures, if there are any. Put pictures in `assets/`
- Remove the answers from the questions and exercises, or put them into different codeblocks to reserve space for attempts
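The title convention `<Code> - <Theme>:<Topic>` can be generated rather than typed by hand. The following is a rough sketch: the `THEMES` mapping, the `CODE_PATTERN` regular expression and the `workshop_title` helper are hypothetical names, and the prefix-to-theme mapping is an assumption inferred from the order of the tables in this document (only the `PY` case is confirmed by the example title).

```python
import re

# Assumed mapping from code prefix to theme name (inferred, not official).
THEMES = {
    "PY": "Introduction to Python Programming",
    "TK": "Toolkits for Data Scientists",
    "DS": "Data Science with Python",
    "DSF": "Data Science Fields",
}

# Assumption: a code is a known prefix followed by two digits, e.g. PY01 or DSF02.
# "DSF" is listed before "DS" so the longer prefix is tried first.
CODE_PATTERN = re.compile(r"^(PY|TK|DSF|DS)(\d{2})$")


def workshop_title(code, topic):
    """Build a title of the form '<Code> - <Theme>:<Topic>'."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"Unrecognised workshop code: {code!r}")
    theme = THEMES[match.group(1)]
    return f"{code} - {theme}:{topic}"


print(workshop_title("PY01", "Fundamentals"))
# PY01 - Introduction to Python Programming:Fundamentals
```

Series codes such as `PY1x` or `PYTx` are deliberately rejected by this pattern; they would need their own rule once the series are finalised.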
All of the above, and:
- If possible, export a PDF version using Typora with the theme `Ursine Umbrella`
- Use DSS logo on every page.
To be continued
Alex's legacy workshops set a sensible structure for workshops. An example would be:

    .
    ├── some-workshop
    │   ├── README.md
    │   ├── workshop.ipynb
    │   ├── problem.ipynb
    │   ├── answer.ipynb
    └── assets
        └── figure1.png

- `workshop`: the workshop material in `.ipynb`, `.pdf`, `.md`, `.ppt` and so on
- `problem`: exercises with no answers
- `answer`: answers to `problem`
- `README`: a syllabus of the course
- `assets/`: where you put the pictures and attachments
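To start a new workshop with this layout, the folders and placeholder files can be created with a few lines of Python. This is only a sketch: `scaffold_workshop` is a hypothetical helper, the files it creates are empty placeholders (an empty `.ipynb` is not yet a valid notebook), and it assumes `assets/` sits next to the workshop folder as drawn in the tree.

```python
import tempfile
from pathlib import Path


def scaffold_workshop(root, name="some-workshop"):
    """Create the empty files and folders of the example layout under `root`."""
    root = Path(root)
    workshop_dir = root / name
    workshop_dir.mkdir(parents=True, exist_ok=True)

    # Placeholder files; touch() does not change the content of existing files.
    for filename in ("README.md", "workshop.ipynb", "problem.ipynb", "answer.ipynb"):
        (workshop_dir / filename).touch()

    # Assumption: assets/ lives next to the workshop folder, as in the tree.
    (root / "assets").mkdir(exist_ok=True)
    return workshop_dir


# Try it in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_workshop(tmp)
    print(sorted(path.name for path in created.iterdir()))
```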
The README should have:
- Title
- Description
- Prerequisite (use workshop `code` if possible)
- Author and How to Contact Author
- Objectives and Outcome
- Outline (just copy and paste the title structure from `h1` to `h6` in your workshop in a tree structure)
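Instead of copying the outline by hand, it can be extracted from the notebook file, which is plain JSON. The sketch below uses hypothetical helper names (`notebook_outline`, `outline_from_file`) and a deliberately simple heading rule: any line in a Markdown cell that starts with one to six `#` characters followed by a space. It does not skip `#` lines inside code fences within Markdown cells, so check the result before pasting it into the README.

```python
import json


def notebook_outline(notebook):
    """Return an indented outline of the Markdown headings in a notebook dict."""
    lines = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", [])
        # In the notebook JSON, "source" may be a list of lines or one string.
        text = "".join(source) if isinstance(source, list) else source
        for line in text.splitlines():
            stripped = line.lstrip()
            level = len(stripped) - len(stripped.lstrip("#"))
            # A heading is 1 to 6 "#" characters followed by a space.
            if 1 <= level <= 6 and stripped[level:level + 1] == " ":
                title = stripped[level:].strip()
                lines.append("  " * (level - 1) + "- " + title)
    return "\n".join(lines)


def outline_from_file(path):
    """Load an .ipynb file and return its heading outline."""
    with open(path, encoding="utf-8") as handle:
        return notebook_outline(json.load(handle))


# A tiny in-memory notebook, made up for illustration.
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# PY03 - Introduction to Python Programming:Logic\n", "Some text\n", "## Booleans\n"]},
        {"cell_type": "code", "source": ["# not a heading\n", "x = True\n"]},
        {"cell_type": "markdown", "source": "## Conditionals\n### if and else"},
    ]
}
print(notebook_outline(example))
```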
The kanbans are used for project management:
- Data Science with Python Kanban
- Toolkits for Data Scientists Kanban
- Introduction to Python Programming Kanban
Create an issue in each repository to represent a TODO. The issues will be added to the Kanban as a card with a person assigned to it. Label the cards well. Close the issue when finished.
- Create a repository with the `UCL-DSS` account. The name of the repository should be `<keyword>-workshop` (e.g. `python-logic-workshop`)
- `Fork` the repository to your own account. Maintain your own repository. Keep commits small and use multiple commits. E.g. `Commit#1: Write introduction` `Commit#2: Update question 1` instead of `Commit#1: Implement workshop 1`
- Create an `issue` for each TODO. Keep comments and communications about an issue in that issue's thread. Keep track of the kanbans to ensure your tasks are managed.
- Make a `pull request` to the `main` branch of the `UCL-DSS` repository when finished. `Link` that pull request to the related issues. I will `merge` it and `close` the issues manually.
Since lectures are mostly pre-recorded this academic year, Monday to Wednesday should fit students' schedules well. Each week, I will issue a ticket to the relevant executives and FYR with a workshop to do. The executive will contact Marketing and provide the marketing information. Usually, that includes:
- Your name
- Your picture
- Theme
- Code
- Title
- Prerequisite
- Difficulty (1~5)
- GitHub repository link
- Preparation needed
- Time and date in London time: you decide which day (Mon ~ Wed) and the exact time to host the workshop, with consideration for people in time zones from UTC+0/+1 to UTC+8
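When picking a slot, it can help to see what a London start time looks like for attendees further east. The sketch below is illustrative only: `start_times` is a hypothetical helper, the date is made up, and it uses fixed UTC offsets, so you pass `0` or `1` for London depending on whether daylight saving time is in effect.

```python
from datetime import datetime, timedelta, timezone


def start_times(london_start, london_offset_hours, other_offsets_hours):
    """Map each UTC offset (in hours) to the local start time at that offset."""
    london_tz = timezone(timedelta(hours=london_offset_hours))
    aware_start = london_start.replace(tzinfo=london_tz)
    return {
        offset: aware_start.astimezone(timezone(timedelta(hours=offset)))
        for offset in other_offsets_hours
    }


# Hypothetical slot: a Monday at 10:00 London time while London is on UTC+1.
slot = datetime(2021, 10, 11, 10, 0)
for offset, local in start_times(slot, 1, [1, 8]).items():
    print(f"UTC+{offset}: {local:%H:%M}")
# UTC+1: 10:00
# UTC+8: 17:00
```

A morning slot in London lands in the late afternoon at UTC+8, whereas an evening slot would fall after midnight there.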
All workshops are hosted using our MS Teams channel and will be recorded. Most of the information can be found above.
Hello to our fellow young data scientists from South Kensington! Greetings from Gower Street!
To be continued