Hey there 👋
First off, we put together this assignment especially for this interview, so if you think things are unclear, don't hesitate to ask us questions.
This assessment represents a slice of the address matching problem we face at Root Sustainability.
The system works with messy input data and needs to:
- Find plausible address matches using external geocoding data
- Score match quality in a meaningful and interpretable way
- Surface uncertainty and data-quality issues
- Support both automated and human-review workflows
You will extend a backend that:
- Talks to the Mapbox geocoding API to find candidate address matches
- Computes a match score between the original address and the best matched address
- Exposes a small HTTP API that the provided React frontend consumes
The repository is organized as:
- frontend/ – React app (Vite + TypeScript) to visualise and manage addresses
- backend/ – FastAPI starter backend that you will extend
- data/ – Test data to validate your solution
- README.md – this file
This assignment is not about building the perfect model. It's about showing how you reason about an applied AI problem and translate that reasoning into working software.
You will:
- Build an address similarity function returning a score between 0.0 and 1.0
- Do some experimentation and document your reasoning
- Integrate your solution into the existing frontend/backend setup
Approximate time budget: 2–3 hours. If you run out of time, focus on 2.1/2.2.
- Get and configure a Mapbox access token
  (Mapbox requires a credit card even for the free tier; if you prefer not to, contact us and we’ll provide a token.)
- Explore the API and decide how to select the “best” match
- Implement your solution in backend/mapbox_client.py
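As a starting point for the "best match" decision, here is a minimal sketch and not the starter's actual client. It assumes the Mapbox Geocoding v5 forward endpoint and its place_name and relevance feature fields; the function names build_geocoding_url and select_best_match, and the min_relevance parameter, are hypothetical. Picking by relevance alone is just one possible rule.

```python
from typing import Any, Dict, Optional
from urllib.parse import quote, urlencode

# Assumption: Mapbox Geocoding v5 forward-geocoding endpoint.
MAPBOX_URL = "https://api.mapbox.com/geocoding/v5/mapbox.places/{query}.json"


def build_geocoding_url(address: str, token: str, limit: int = 5) -> str:
    """Build the request URL for a forward geocoding lookup."""
    params = urlencode({"access_token": token, "limit": limit})
    return MAPBOX_URL.format(query=quote(address, safe="")) + "?" + params


def select_best_match(
    response: Dict[str, Any], min_relevance: float = 0.0
) -> Optional[str]:
    """Return the place_name of the most relevant feature, or None.

    Ties keep the API's own ordering, because max() returns the first
    maximal element.
    """
    features = response.get("features", [])
    candidates = [f for f in features if f.get("relevance", 0.0) >= min_relevance]
    if not candidates:
        return None
    best = max(candidates, key=lambda f: f.get("relevance", 0.0))
    return best.get("place_name")


# Hand-written illustrative response, not real API output.
sample = {
    "features": [
        {"place_name": "Paris, France", "relevance": 0.9},
        {"place_name": "Paris, Texas, United States", "relevance": 0.6},
    ]
}
print(select_best_match(sample))
```

Keeping the selection rule separate from the HTTP call makes it easy to test against saved responses without spending API quota.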
We have provided a baseline implementation of the address similarity function in backend/similarity.py.
The address similarity function should:
- Return a value in [0.0, 1.0]
- Represent whether two addresses likely point to the same real-world entity
- Be reasonably robust to:
  - Language differences
  - Spelling variations
  - Capitalization
  - Formatting differences
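To make these requirements concrete, here is one possible baseline, offered as a sketch rather than the implementation shipped in backend/similarity.py. It normalises case, accents, and punctuation, then compares the strings with the standard library's difflib.SequenceMatcher. The names normalize and similarity are illustrative. Note that this approach does nothing for language differences ("Parijs" vs "Paris" only scores by character overlap).

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(address: str) -> str:
    """Lowercase, strip accents, and collapse punctuation and whitespace."""
    decomposed = unicodedata.normalize("NFKD", address)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    tokens = re.findall(r"\w+", without_accents.casefold())
    return " ".join(tokens)


def similarity(a: str, b: str) -> float:
    """Return a character-level similarity in [0.0, 1.0]."""
    na, nb = normalize(a), normalize(b)
    if not na or not nb:
        # Assumption: an empty address never counts as a match.
        return 0.0
    return SequenceMatcher(None, na, nb).ratio()


# Capitalization and formatting differences disappear after normalisation.
print(similarity("Paris, France", "PARIS   france"))
```

A character-level ratio is cheap and interpretable, but it can be over-confident for different places with similar names, which is worth checking during experimentation.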
We would like you to:
- Explore multiple approaches qualitatively
- Compare them quantitatively using the dataset in data/addresses.csv
- Pick one final approach, explain why, and implement it in backend/similarity.py
- Document what you tried, what you chose, and what you would explore next in EXPERIMENTS.md
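A quantitative comparison could look roughly like the sketch below. It deliberately does not read data/addresses.csv, because its columns are not described here. Instead it assumes you can turn the dataset into (address, candidate, is_match) triples; that shape, the evaluate helper, the 0.5 threshold, and the example pairs are all assumptions for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple

Pair = Tuple[str, str, bool]  # (raw address, candidate, labelled as same entity?)


def evaluate(
    score_fn: Callable[[str, str], float],
    pairs: Iterable[Pair],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Threshold the scores and report accuracy, precision, and recall."""
    tp = fp = fn = tn = 0
    for a, b, is_match in pairs:
        predicted = score_fn(a, b) >= threshold
        if predicted and is_match:
            tp += 1
        elif predicted and not is_match:
            fp += 1
        elif not predicted and is_match:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def exact_match(a: str, b: str) -> float:
    """Trivial reference scorer: 1.0 only for case-insensitive equality."""
    return 1.0 if a.strip().casefold() == b.strip().casefold() else 0.0


# Made-up pairs, only to show the call shape.
pairs = [
    ("Paris, France", "paris, france", True),
    ("Parijs, Frankrijk", "Paris, France", True),
    ("Berlin", "Paris", False),
]
print(evaluate(exact_match, pairs))
```

Running every candidate scorer through the same function gives comparable numbers for EXPERIMENTS.md, and sweeping the threshold shows how each approach trades precision against recall.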
We care at least as much about your reasoning as about the final score.
If you have time left, feel free to:
- Do some API or code polishing
- Play with your solution via the frontend
Pay attention to how the system behaves in practice:
- Latency, failures, or confusing behaviour
- Ambiguous input or over-confident scores
We’ll discuss any limitations you might have noticed during the interview.
A minimal FastAPI backend starter is provided in backend/. It includes:
- Data models for addresses
- An SQLite database and SQLAlchemy ORM model
- Endpoint skeletons matching the contract below
- A naive Mapbox integration and similarity baseline
An address record has the following shape:
{
"id": 1,
"address": "Parijs, Frankrijk",
"matched_address": "Paris, France",
"match_score": 0.98
}
Where:
- address – raw input address as entered by a user
- matched_address – the best match returned by Mapbox
- match_score – a float in [0, 1] indicating how likely these refer to the same real-world entity (1.0 = clearly the same, 0.0 = clearly not)
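The record shape above can be mirrored in code as a small typed structure. The sketch below uses a standard-library dataclass only to stay dependency-free; the starter's own data models and SQLAlchemy ORM model may look different, and the class name AddressRecord is hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass
class AddressRecord:
    id: int
    address: str          # raw input as entered by a user
    matched_address: str  # best match returned by Mapbox
    match_score: float    # 1.0 = clearly the same, 0.0 = clearly not

    def __post_init__(self) -> None:
        # Enforce the documented [0, 1] range at construction time.
        if not 0.0 <= self.match_score <= 1.0:
            raise ValueError(f"match_score out of range: {self.match_score}")


record = AddressRecord(
    id=1,
    address="Parijs, Frankrijk",
    matched_address="Paris, France",
    match_score=0.98,
)
print(asdict(record))
```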
Your backend should implement the following endpoints:
- List all addresses
  GET /addresses
  - Response: 200 OK with Address[]
- Create an address
  POST /addresses
  - Body: { "address": "string" }
  - Behaviour:
    - Call Mapbox to find the best match
    - Calculate a similarity score
    - Store the record
    - Return 201 Created (or 200 OK) with the full Address object
- Get a single address
  GET /addresses/{id}
  - Response: 200 OK with Address
  - 404 if not found
- Update an address
  POST /addresses/{id}
  - Body: { "address": "string" }
  - Behaviour:
    - Recalculate Mapbox match and similarity score
    - Store the updated record
    - Return 201 Created (or 200 OK) with the full Address object
- Refresh scores of one or more addresses
  POST /addresses/refresh
  - Body: { "ids": [1, 2, 3] }
  - Behaviour:
    - Recalculate matches and scores
    - If ids is null, refresh all records
    - Return 200 OK
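The behaviour behind these endpoints can be sketched without any web framework, which keeps it easy to test. The in-memory AddressStore below is a hypothetical illustration and not the starter's SQLite/SQLAlchemy code. The geocoder and scorer are injected as plain callables, and scoring an unmatched address as 0.0 is an assumption. FastAPI handlers could delegate to something like this and map a None result to a 404.

```python
from typing import Callable, Dict, List, Optional


class AddressStore:
    """In-memory stand-in for the create/get/update/refresh behaviour."""

    def __init__(
        self,
        find_match: Callable[[str], Optional[str]],
        score: Callable[[str, str], float],
    ) -> None:
        self._find_match = find_match
        self._score = score
        self._records: Dict[int, dict] = {}
        self._next_id = 1

    def _build(self, record_id: int, address: str) -> dict:
        matched = self._find_match(address)
        # Assumption: no match at all is reported as a score of 0.0.
        match_score = self._score(address, matched) if matched else 0.0
        return {
            "id": record_id,
            "address": address,
            "matched_address": matched,
            "match_score": match_score,
        }

    def list_all(self) -> List[dict]:
        return list(self._records.values())

    def create(self, address: str) -> dict:
        record = self._build(self._next_id, address)
        self._records[record["id"]] = record
        self._next_id += 1
        return record

    def get(self, record_id: int) -> Optional[dict]:
        return self._records.get(record_id)  # None maps to 404

    def update(self, record_id: int, address: str) -> Optional[dict]:
        if record_id not in self._records:
            return None  # maps to 404
        self._records[record_id] = self._build(record_id, address)
        return self._records[record_id]

    def refresh(self, ids: Optional[List[int]] = None) -> None:
        # ids=None means "refresh everything"; unknown ids are skipped.
        targets = list(self._records) if ids is None else ids
        for record_id in targets:
            if record_id in self._records:
                address = self._records[record_id]["address"]
                self._records[record_id] = self._build(record_id, address)
```

Injecting the geocoder also makes it straightforward to simulate Mapbox failures or slow responses when checking how the system behaves in practice.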
From the repository root:
cd backend
pip install -r requirements.txt
uvicorn main:app --reload --port 8000
The API is then served on port 8000.
A small React + TypeScript frontend is provided in frontend/. It allows you to:
- View all addresses in a table
- Select rows via checkboxes
- Add a new address
- Inspect a single address in detail and update it
- Refresh scores for selected or all addresses
cd frontend
npm install
npm run dev
Send us a link to a Git repository on the day before your interview containing:
- Your backend implementation
- Any notebooks or scripts used during experimentation
- Your EXPERIMENTS.md
Good luck! 🔥