Skip to content

CellerCity/SyncText-CRDT

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SyncText: A CRDT-Based Collaborative Text Editor

1. Project Overview

SyncText is a real-time, multi-user collaborative text editor that simulates the core functionality of modern tools like Google Docs. It is built in C++ and runs on a Linux environment.

The system is fully decentralized (peer-to-peer) and uses a Conflict-Free Replicated Data Type (CRDT) model to ensure that all users' documents converge to the same state. Synchronization is achieved in a lock-free manner using POSIX shared memory for user discovery and POSIX message queues for inter-process communication (IPC).

2. Core Features

Multiple Concurrent Users: Supports up to 10 users editing simultaneously in separate terminals.

Decentralized Architecture: No central server; each user is a peer.

Automatic Change Detection: The system automatically detects when a user saves changes to their local file by monitoring the file's modification time.

Real-Time Synchronization: Changes are batched (default N=5) and broadcast to all other active users via message queues.

Lock-Free CRDT Merging: Uses a Last-Writer-Wins (LWW) per-span algorithm to resolve conflicts deterministically without using any mutexes or locks.

Atomic Operations: User registration in shared memory is handled atomically to prevent race conditions.

Multi-Threaded: A dedicated listener thread handles all incoming messages, while the main thread manages local file changes and broadcasting.

3. Dependencies

Compiler: g++ (C++17)

Build System: make

Libraries:-

pthread: For multi-threading (pthread_create, pthread_join).

lrt: The POSIX Real-Time Extensions library, required for message queues (mq_open, mq_send, etc.).

4. Platform

OS: Developed and tested on Ubuntu 22.04.4 LTS.

Compiler: g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0.

5. Compilation Instructions

A Makefile is provided for easy compilation.

Ensure you have 'g++' and 'make' installed.

From the project's root directory first relocate to the 'source_files' directory cd source_files

then simply run: make

This will compile all .cpp source files and create a single executable named editor.

To clean up all object files (.o) and the executable, run: make clean

6. Execution Instructions

First of all ensure that you are in the source_files directory.

Give execute permission to "editor" executable if it fails to run for some reason [shouldn't happen]. chmod +x editor

To run the system, you must open multiple terminals (one for each user) and run the editor executable with a unique user ID as a command-line argument.

Example for 3 users:

Terminal 1: ./editor user_1

Terminal 2: ./editor user_2

Terminal 3: ./editor user_3

Each user's terminal will initialize, display the list of active users, and begin monitoring their local document (user_1_doc.txt, user_2_doc.txt, etc.).

Note:

All these local document copies (1 for each user) will be created in the 'source_files' directory itself.

To close the editor for a user, simply press Ctrl+C. This would handle all the cleaning of the shared memory contents that would have been created and then close the application properly.

7. How to Test (Example Scenarios)

The BROADCAST_THRESHOLD is set to 5. This means a user will broadcast their changes after accumulating 5 operations.

Scenario 1: Non-Conflicting Edits

Start user_1 and user_2.

user_1: Edit user_1_doc.txt, change Line 0 to "Hello" and save. (1 op in send_buffer) user_2: Edit user_2_doc.txt, change Line 1 to "World" and save. (1 op in send_buffer) user_1: Edit user_1_doc.txt, change Line 2 to "Test" and save. (2 ops in send_buffer. Note: 3 more ops are needed to trigger broadcast).

user_1 will broadcast once the threshold is met.

Observe user_2: user_2's terminal will receive the updates, merge them, and the document will display:

Line 0: Hello [MODIFIED] Line 1: World Line 2: Test [MODIFIED]

The log will show "Received update from user_1..." and "All updates merged successfully."

Scenario 2: Conflicting Edits (LWW Resolution)

Start user_1 and user_2. (Assume user_1 < user_2 lexicographically).

user_1: Edit Line 0 to "10" and save. (OpA, T1) user_2: Edit Line 0 to "20" and save. (OpB, T2, where T2 > T1) user_1: Edit Line 1 to "30" and save. (OpC)

user_1's buffer ([OpA, OpC]) is not yet full. user_1 makes 3 more edits to trigger a broadcast.

Observe user_2:

user_2 receives user_1's ops, including OpA and OpC. merge_and_apply_updates runs. It gathers remote ops [OpA, OpC, ...] and local op [OpB].

A conflict is detected on Line 0 between OpA (T1) and OpB (T2). LWW Resolution: OpB (T2) wins because T2 > T1. OpA is marked as a loser.

user_2's send_buffer is pruned. OpB (local winner) is kept.

Remote op OpC is applied.

user_2's Final State:

Line 0: 20 Line 1: 30 [MODIFIED]

user_2: Edit Line 2 to "40" and save. (OpD) user_2's buffer ([OpB, OpD]) is not yet full. user_2 makes 3 more edits to trigger a broadcast.

Observe user_1: user_1 receives user_2's ops, including OpB and OpD. merge_and_apply_updates runs.

A conflict is detected on Line 0 (where user_1 has "10") with the incoming OpB ("20", T2).

LWW Resolution: OpB (T2) wins because T2 > T1.

The merge logic overwrites Line 0 with "20".

Remote op OpD is applied.

user_1's Final State:

Line 0: 20 [MODIFIED] Line 1: 30 Line 2: 40 [MODIFIED]

Result: Both users have converged to the same correct state.

Scenario 3: Known Limitation (Diff Algorithm)

As discussed in the DESIGNDOC, the LWW-per-span model has a fundamental ambiguity with INSERT ops at column 0. This is compounded by the "common prefix/suffix" diff algorithm.

Start user_1 and user_2.

user_1: Edit Line 0 from "Hello" to "ok_Hello".

Observe user_1: The diff algorithm will (incorrectly) detect this as: Change detected: Line 0, col 0-2, "" -> "ok_"

user_1: Make 4 more edits to trigger a broadcast.

Observe user_2:

user_2 receives the (INSERT, "ok_", col 0) operation. The merge logic (which handles INSERT at col 0 as an "overwrite" to prevent state corruption) will apply this.

user_2's Line 0 will be overwritten to become "ok_". The original "Hello" text is lost.

This behavior is a known trade-off, required to prevent the more critical 1020/2010 state corruption bug. It is a fundamental limitation of the LWW-per-span model itself.

About

A decentralized, peer-to-peer collaborative editor built in C++ using CRDTs, POSIX IPC, and lock-free synchronization.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors