Author: Gábor Békés, Central European University (Austria, EU)
Version: v.0.4. 2025-04-04
This course will equip students who are already versed in core data analysis methods with the experience to harness AI technologies to improve productivity (yes, this is a classic LLM sentence). But, yeah, the idea is to help students who have studied data analysis / econometrics / quant methods think about how to include AI in their analytics routine, and to spend time sharing experiences.
As AI becomes more and more powerful, it is also important to provide a platform to discuss human agency in data analysis. So a key element of the course is for its instructor to lead discussions on the role of AI and humans in various aspects of data analysis.
The course focuses on using large language models (LLMs), such as OpenAI's ChatGPT, Anthropic's Claude.ai, Mistral's Le Chat, and Google's Gemini, to carry out tasks in data analysis. It includes topics like data extraction and wrangling, data exploration and descriptive statistics, creating reports, and turning text into data.
There are three case studies that we use: (1) a simulated set of data tables on hotels in Austria, (2) the World Values Survey, and (3) a series of interview texts.
The course material includes weekly practice assignments.
You need a background in Data Analysis / Econometrics; a good introductory course is enough. I, of course, suggest Chapters 1-12 and 19 of Data Analysis for Business, Economics and Policy (Cambridge UP, 2021). Full slideshows, data and code are open source. But consider buying the book. In particular, the course builds on Chapters 1-6, 7-10, and 19 of Data Analysis, but other introductory econometrics plus basic data science knowledge is OK.
Students are expected to have some basic coding knowledge in Python or R (Stata is mostly fine, too).
AI is everywhere and has become essential; most analytical work will use it.
Key outcomes. By the end of the course, students will
- Gain experience and confidence using genAI to carry out key tasks in data analysis.
- Build AI into their coding practice, including data wrangling, description, reporting, and text analysis.
- Have some idea of use cases where AI assistance is (1) greatly useful, (2) helpful, (3) currently problematic.
- Have some idea of use cases where AI assistance is OK to use as is vs. where it needs strong human supervision.
- Have an understanding of resources to follow for updates.
This is a course aimed at 3rd (2nd?) year BA and MA students in any program with the required background.
But anyone with an adequate background can use it.
Assignments are available for all classes.
**Important to note for assignments:**
- Use AI but do not submit something that was created by AI. AI is your assistant.
- One of the goals of the course is to practice this.
What are LLMs, and how does the magic happen? A brief, non-technical intro. How to work with LLMs? Plus ideas on applications. Includes suggested readings, podcasts, and videos.
Which AI? See my take on current models. As of May 2025.
Learn how to write clear and professional code and data documentation. LLMs are a great help once you know the basics.
Case study: World Values Survey. Data is at WVS
You have your data and task, and need to write a short report. We compare different ways of working with LLMs, from a one-shot prompt to iteration.
Case study: World Values Survey. Data is at WVS
When asked what I should add to my textbook, David Card, the Nobel-winning empirical economist, told me that I should spend time on joining tables. Here we go.
Case study: simulated Austrian hotels. Data is at hotels
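To make "joining tables" concrete, here is a minimal sketch of a one-to-many left join with aggregation, using Python's built-in `sqlite3` module. The table names, column names, and values (`hotels`, `bookings`, `hotel_id`, `city`, `price`) are hypothetical and made up for illustration; they are not the structure of the course's simulated Austrian hotels data.

```python
import sqlite3

# Hypothetical toy tables, invented for illustration only.
# They are NOT the course's simulated Austrian hotels data.
hotels = [(1, "Vienna"), (2, "Salzburg"), (3, "Graz")]   # (hotel_id, city)
bookings = [(1, 120.0), (1, 95.0), (2, 150.0)]           # (hotel_id, price)


def join_hotels_bookings(hotels, bookings):
    """Left-join hotels to bookings and summarize bookings per hotel."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hotels (hotel_id INTEGER PRIMARY KEY, city TEXT)")
    conn.execute("CREATE TABLE bookings (hotel_id INTEGER, price REAL)")
    conn.executemany("INSERT INTO hotels VALUES (?, ?)", hotels)
    conn.executemany("INSERT INTO bookings VALUES (?, ?)", bookings)
    rows = conn.execute(
        """
        SELECT h.hotel_id,
               h.city,
               COUNT(b.price) AS n_bookings,
               AVG(b.price)   AS avg_price
        FROM hotels AS h
        LEFT JOIN bookings AS b ON b.hotel_id = h.hotel_id
        GROUP BY h.hotel_id, h.city
        ORDER BY h.hotel_id
        """
    ).fetchall()
    conn.close()
    return rows


result = join_hotels_bookings(hotels, bookings)
for row in result:
    print(row)
```

The left join keeps the hotel with no bookings (it shows up with zero bookings and a missing average price), which is exactly the kind of detail worth checking by hand when an LLM writes the join for you.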
No course of mine can escape football (soccer). Here we look at post-game interviews to learn the basics of text analysis and apply LLMs to what they do best: context-dependent learning. This is a two-class series. The first class is more of an intro to natural language processing.
Case study: football post-game interviews. Data is at interviews
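As a taste of the "basics of text analysis" part, here is a minimal bag-of-words sketch: lowercase the text, split it into tokens, drop a few stopwords, and count what remains. The interview snippets and the stopword list are hypothetical and invented for illustration; they are not taken from the course's interview data.

```python
import re
from collections import Counter

# Hypothetical snippets, invented for illustration (not the course data).
interviews = [
    "We played a good game, but we were unlucky in the second half.",
    "The referee made a bad call. We deserved more from this game.",
]

# A tiny, hand-picked stopword list; real projects use a fuller one.
STOPWORDS = {"we", "the", "a", "in", "but", "were", "from", "this", "more"}


def tokenize(text):
    """Lowercase the text and return runs of letters/apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(texts, stopwords=STOPWORDS):
    """Count non-stopword tokens across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(t for t in tokenize(text) if t not in stopwords)
    return counts


counts = word_counts(interviews)
print(counts.most_common(3))
```

Word counts ignore context entirely ("bad call" and "good game" are just four separate words), which is the gap that context-aware tools such as LLMs are meant to fill.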
Second class: now we are in action. How do LLMs compare to humans?
Case study: football post-game interviews. Data is at interviews
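One possible way to put a number on "how do LLMs compare to humans" is to have both label the same texts and then measure agreement. The sketch below computes the raw agreement rate and Cohen's kappa (agreement corrected for chance). This is an assumption about how such a comparison could be done, not necessarily the method used in class, and the labels are hypothetical.

```python
from collections import Counter

# Hypothetical sentiment labels for the same six interview snippets.
human_labels = ["pos", "neg", "neg", "pos", "neu", "neg"]
llm_labels = ["pos", "neg", "pos", "pos", "neu", "neg"]


def agreement_rate(a, b):
    """Share of items on which the two raters give the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    n = len(a)
    p_observed = agreement_rate(a, b)
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: both raters pick the same label independently.
    p_expected = sum(
        (freq_a[label] / n) * (freq_b[label] / n)
        for label in set(a) | set(b)
    )
    if p_expected == 1.0:
        return 1.0  # both raters always use the same single label
    return (p_observed - p_expected) / (1 - p_expected)


rate = agreement_rate(human_labels, llm_labels)
kappa = cohens_kappa(human_labels, llm_labels)
print(f"agreement: {rate:.2f}, kappa: {kappa:.2f}")
```

Kappa is lower than raw agreement because some matches would happen by chance alone; neither number says who is right when the human and the LLM disagree.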
I'm adding material to the learn-more folder. You can start with the beyond page.
Attribution: Békés, Gábor: "Doing Data Analysis with AI: a short course", available at github.com/gabors-data-analysis/da-w-ai/, v0.5, 2025-05-14
License: CC BY-NC-SA 4.0 -- share, attribute, non-commercial (contact me for corporate gigs)
Thanks: Developed mostly by me, Gábor Békés. Thanks a million to the two wonderful human RAs, Ms Zsuzsa Vadle and Mr Kenneth Colombe, both PhD students. Thanks to Claude.ai, which helped a great deal in creating the simulated dataset. ChatGPT and Claude.ai helped create the slideshows and educated me on NLP.
Thanks to CEU's teaching grant, which allowed me to pay humans and AI.
This material is based on my course at CEU in Vienna, Austria. I teach it from this GitHub repo.
If you have questions or suggestions, or are interested in learning more, just fill in this form.
AI use is very costly in terms of energy. Yes, it is becoming cheaper. But humanity is also using much more of it.

