Skip to content

Conversation

@penkk55
Copy link

@penkk55 penkk55 commented Jun 17, 2025

Solution notes

Task 01 – Run‑Length Encoder

[x] Done

  • Language: Go

  • Approach: I used a single-pass, rune-based iteration to count consecutive characters. The string is converted to a []rune to support UTF-8 characters. A strings.Builder accumulates the result by appending each character followed by its count.

  • Why: UTF-8 safe: Using []rune ensures multibyte characters (e.g. emojis or Thai letters)
    are handled properly.
    Efficient: strings.Builder is used to avoid repeated string concatenation (which would
    be inefficient).
    Linear time complexity: The algorithm scans the string once (O(n)).

  • Time spent: ~10 min

  • AI tools used: ChatGPT (for validation and write-up support)

Task 02 – Fix‑the‑Bug

[x] Done

  • Language: Go

  • Approach: The original code used a global current variable without synchronization, which caused data races when accessed from multiple goroutines. I fixed this by introducing a sync.Mutex to protect access to the shared variable. The NextID() function now uses mu.Lock() and mu.Unlock() to ensure only one goroutine can read and update current at a time.

  • Why: Using sync.Mutex guarantees thread safety and prevents race conditions by serializing access to the critical section. While it's not as fast as lock-free approaches like sync/atomic, it's simple, easy to understand, and sufficient for cases where performance is acceptable and clarity is preferred.

  • validation: go run -race tasks/02-fix-the-bug/go/buggy_counter.go

  • Time spent: ~15 min

  • AI tools used: ChatGPT (write-up support)

Task 03 – Sync-aggregator

[x] Done

  • Language: Go
  • Approach: I implemented a concurrent file processing system using a fixed-size worker pool (with sync.WaitGroup) and Go channels. Each worker processes a file by counting lines and words, while respecting a per‑file timeout using context.WithTimeout. File paths are resolved relative to the working directory using filepath.Abs. To maintain the correct order of results, each task is indexed and results are collected into a slice in input order.

I also added logic to:
Skip any file that starts with #sleep=N where N >= 5, returning a timeout status.

Ignore metadata lines starting with `#` for line/word counting.
  • Why: This approach ensures:

    Concurrency control (limits goroutines using a worker pool)

    Safe timeout enforcement (to prevent hanging or long-running file reads)

    Ordered results (matching the order of paths in filelist.txt)

    Compatibility with test runner environments (by resolving relative paths dynamically)

Using goroutines and channels allows for high throughput without sacrificing correctness. Applying file-level timeout ensures slow files don’t block the entire operation.

  • Time spent: ~70 min
  • AI tools used: ChatGPT [test troubleshooting, and edge-case handling, write-up support]

Task 04 – SQL-resoning

[x] Done

  • Language: Go (SQL)
  • Approach: For Task A, I computed the total pledged amount per campaign and calculated each campaign's percentage of its funding target using SUM() and GROUP BY. The result was ordered by pct_of_target descending.

For Task B, I calculated the 90th percentile (P90) of pledge amounts both globally and for donors from Thailand.

I used window functions (ROW_NUMBER, COUNT, OVER) to rank and compute each pledge's position.

Then applied linear interpolation to calculate the percentile accurately using a subquery join on rank.

Final result was rounded using ROUND(..., 0) to ensure integer output as expected in the test.

I added relevant indexes to optimize query performance, especially on donor.country, donor.id, pledge.donor_id, and pledge.amount_thb.

  • Why: Using SQL window functions and common table expressions (CTEs) makes the logic clear, maintainable, and performant even on large datasets.

Interpolation ensures accurate percentile computation instead of relying on simple LIMIT or approximation.

Indexes improve JOIN and filter performance significantly, especially for country = 'Thailand' and pledge amount ranking.

  • Time spent: ~30 min
  • AI tools used: ChatGPT [index strategy and write-up support]

Task 02 – Fix‑the‑Bug

[x] Done

  • Language: Python
  • Time spent: ~50 min
  • AI tools used: ChatGPT

SUMMARY

Completed all 4 tasks in Go (plus SQL for Task 4) with focus on correctness, concurrency safety, and clean logic.
Each task was implemented efficiently and verified with provided tests. Used Go’s standard library features like goroutines, channels, context timeouts, mutex locks, and SQL window functions. Edge cases (e.g. UTF-8 strings, file timeouts, percentile interpolation) were handled carefully to match expected outputs.

Used ChatGPT for validation, write-up clarity, and troubleshooting during complex logic (especially for Task 03).
Total time spent: ~125 min.

Tried Task 02 in Python just for fun, exploring how concurrency safety works differently in another language.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR submits completed solutions for the coding challenge, including implementations in Go, Python, and SQL for four tasks.

  • Implemented UTF-8–safe run-length encoding in Go.
  • Fixed concurrency bugs in Go and Python counters and added a Python concurrency test.
  • Built a concurrent file-stats aggregator with worker pool and timeouts in Go.
  • Added SQL queries (Task A & B) with Go wrappers and supporting indexes.

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tasks/01-run-length/go/rle.go Added Encode implementation for run-length encoding.
tasks/02-fix-the-bug/go/buggy_counter.go Wrapped NextID in a mutex to ensure thread safety.
tasks/02-fix-the-bug/python/buggy_counter.py Introduced lock for _current, added main() hook.
tasks/02-fix-the-bug/python/test2_counter.py Added concurrency test to detect duplicate IDs.
tasks/03-sync-aggregator/go/aggregator.go Implemented Aggregate with worker pool and timeouts.
tasks/04-sql-reasoning/taska.sql Added campaign funding percentage query.
tasks/04-sql-reasoning/taskb.sql Added 90th-percentile SQL with interpolation.
tasks/04-sql-reasoning/go/queries.go Exposed SQLA/SQLB and index statements in Go code.
SOLUTIONS.md Documented solution approaches and time spent.
Comments suppressed due to low confidence (3)

tasks/01-run-length/go/rle.go:11

  • Consider adding unit tests for Encode, covering edge cases like the empty string and multi-byte characters to verify correct behavior.
func Encode(s string) string {

SOLUTIONS.md:80

  • There’s a duplicate ### Task 02 – Fix-the-Bug section in the solutions doc; remove the redundant heading to avoid confusion.
### Task 02 – Fix‑the‑Bug

tasks/02-fix-the-bug/python/buggy_counter.py:7

  • The threading module isn’t imported; add import threading at the top so threading.Lock() is defined.
_lock = threading.Lock()  # Protects access to _current

print(f"Current ID: {_current}")
value = _current
time.sleep(0) # Optional; simulates work
_current += 1
Copy link

Copilot AI Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The increment of _current occurs outside the lock. Move _current += 1 inside the with _lock: block to ensure the update is thread-safe.

Suggested change
_current += 1
_current += 1 # Increment inside the lock to ensure thread safety

Copilot uses AI. Check for mistakes.
Comment on lines +125 to +128
absBase, err := filepath.Abs("../data")
if err != nil {
return nil, err
}
Copy link

Copilot AI Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Hardcoding "../data" for the base directory is brittle; consider deriving the data directory relative to filelistPath or passing it in as a parameter.

Suggested change
absBase, err := filepath.Abs("../data")
if err != nil {
return nil, err
}
baseDir := filepath.Dir(filelistPath)
absBase := filepath.Join(baseDir, "data")

Copilot uses AI. Check for mistakes.
var Indexes = []string{} // skipped
var Indexes = []string{
`CREATE INDEX IF NOT EXISTS idx_donor_country ON donor(country);`,
`CREATE INDEX IF NOT EXISTS idx_donor_id ON donor(id);`,
Copy link

Copilot AI Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Indexing donor(id) is redundant since id is the primary key and already indexed; you can remove this to simplify maintenance.

Suggested change
`CREATE INDEX IF NOT EXISTS idx_donor_id ON donor(id);`,

Copilot uses AI. Check for mistakes.
`CREATE INDEX IF NOT EXISTS idx_donor_country ON donor(country);`,
`CREATE INDEX IF NOT EXISTS idx_donor_id ON donor(id);`,
`CREATE INDEX IF NOT EXISTS idx_pledge_donor_id ON pledge(donor_id);`,
`CREATE INDEX IF NOT EXISTS idx_pledge_amount ON pledge(amount_thb);`,
Copy link

Copilot AI Jun 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The index on pledge(amount_thb) is unlikely to be used by these queries (no direct filter on amount); removing it can reduce unnecessary index overhead.

Suggested change
`CREATE INDEX IF NOT EXISTS idx_pledge_amount ON pledge(amount_thb);`,

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants