[decimal] Re-implement the true_divide() function for BigDecimal (#110)

forfudan · web-flow · commit f0266cb647de · 2025-07-18T22:37:08.000+02:00
This pull request introduces enhancements to the BigDecimal division
functionality, including refactoring of the `true_divide` function and
the implementation of two new division methods: `true_divide_fast` and
`true_divide_general`. These changes aim to improve precision,
performance, and code maintainability.

### Division Function Refactoring:
* Refactored the `true_divide` function to simplify logic and delegate
division tasks to specialized helper functions (`true_divide_fast` and
`true_divide_general`). This improves readability and separates
concerns.

### New Division Methods:
* **`true_divide_fast`**: Introduced a high-performance division method
that avoids rounding and trailing zero removal, suitable for cases where
speed is prioritized over formatting.
* **`true_divide_general`**: Added a precise division method that rounds
results to the specified precision and removes trailing zeros, so that
exactly divisible operands yield a result without padding zeros.
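
The contrast between the two helpers can be illustrated with a minimal Python sketch. It is not the DeciMojo implementation: it works on positive integers only, counts fractional digits rather than significant digits, assumes that "avoids rounding" means truncation, and picks round-half-even as the rounding mode. The names `divide_fast` and `divide_general` are hypothetical.

```python
def divide_fast(a: int, b: int, digits: int) -> tuple[int, int]:
    """Return (coefficient, scale) for a / b with `digits` fractional digits.

    The quotient is truncated (no rounding) and trailing zeros are kept.
    The represented value is coefficient / 10**scale.
    """
    return (a * 10**digits) // b, digits


def divide_general(a: int, b: int, digits: int) -> tuple[int, int]:
    """Return (coefficient, scale) for a / b, rounded and normalized.

    The quotient is rounded half-to-even at `digits` fractional digits,
    then trailing zeros are stripped from the coefficient.
    """
    q, r = divmod(a * 10**digits, b)
    # Round half to even, decided from the remainder.
    if 2 * r > b or (2 * r == b and q % 2 == 1):
        q += 1
    scale = digits
    # Drop trailing zeros while there are fractional digits left.
    while scale > 0 and q % 10 == 0:
        q //= 10
        scale -= 1
    return q, scale


print(divide_fast(1, 4, 5))     # (25000, 5), i.e. 0.25000
print(divide_general(1, 4, 5))  # (25, 2), i.e. 0.25
print(divide_fast(2, 3, 4))     # (6666, 4), i.e. 0.6666
print(divide_general(2, 3, 4))  # (6667, 4), i.e. 0.6667
```

The fast variant does a single big-integer division and returns immediately. The general variant adds a rounding step and a normalization loop, which is the extra work the description says the fast path skips.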

### Code Simplification:
* Replaced unused imports (`time` and `testing`) with `math` in
`arithmetics.mojo`, aligning dependencies with the updated
implementation.

### Benchmark Enhancements:
* Added new benchmark cases (Cases 57–64) that divide BigDecimal
numbers of varying word sizes, including extremely large numbers.
These cases use the newly introduced `ITERATIONS_LARGE_NUMBERS` constant
as their iteration count.
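
The word sizes quoted in the benchmark labels can be checked from the operand strings. The Python sketch below assumes that one word holds 9 decimal digits, which is inferred from the labels rather than stated in this description; `word_count` is a hypothetical helper and not part of DeciMojo.

```python
DIGITS_PER_WORD = 9  # assumption: one word stores 9 decimal digits


def word_count(literal: str) -> int:
    """Count the words needed to store all digits of a decimal literal."""
    digits = sum(ch.isdigit() for ch in literal)
    return -(-digits // DIGITS_PER_WORD)  # ceiling division


# Operands of Case 59 ("Division 65536 words / 32768 words").
dividend = "123456789" * 32768 + "." + "123456789" * 32768
divisor = "987654321" * 16384 + "." + "987654321" * 16384

print(word_count(dividend), word_count(divisor))  # 65536 32768
```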
diff --git a/benches/bigdecimal/bench_bigdecimal_divide.mojo b/benches/bigdecimal/bench_bigdecimal_divide.mojo
@@ -12,6 +12,7 @@ from collections import List
 
 alias PRECISION = 4096
 alias ITERATIONS = 100
+alias ITERATIONS_LARGE_NUMBERS = 3
 
 
 fn open_log_file() raises -> PythonObject:
@@ -758,6 +759,86 @@ fn main() raises:
         speedup_factors,
     )
 
+    # Case 57: Division 65536 words / 65536 words
+    run_benchmark_divide(
+        "Division 65536 words / 65536 words",
+        "123456789" * 32768 + "." + "123456789" * 32768,
+        "987654321" * 32768 + "." + "987654321" * 32768,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 58: Division 262144 words / 262144 words
+    run_benchmark_divide(
+        "Division 262144 words / 262144 words",
+        "123456789" * 131072 + "." + "123456789" * 131072,
+        "987654321" * 131072 + "." + "987654321" * 131072,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 59: Division 65536 words / 32768 words
+    run_benchmark_divide(
+        "Division 65536 words / 32768 words",
+        "123456789" * 32768 + "." + "123456789" * 32768,
+        "987654321" * 16384 + "." + "987654321" * 16384,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 60: Division 65536 words / 16384 words
+    run_benchmark_divide(
+        "Division 65536 words / 16384 words",
+        "123456789" * 32768 + "." + "123456789" * 32768,
+        "987654321" * 8192 + "." + "987654321" * 8192,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 61: Division 65536 words / 8192 words
+    run_benchmark_divide(
+        "Division 65536 words / 8192 words",
+        "123456789" * 32768 + "." + "123456789" * 32768,
+        "987654321" * 4096 + "." + "987654321" * 4096,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 62: Division 65536 words / 4096 words
+    run_benchmark_divide(
+        "Division 65536 words / 4096 words",
+        "123456789" * 32768 + "." + "123456789" * 32768,
+        "987654321" * 2048 + "." + "987654321" * 2048,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 63: Division 65536 words / 2048 words
+    run_benchmark_divide(
+        "Division 65536 words / 2048 words",
+        "123456789" * 32768 + "." + "123456789" * 32768,
+        "987654321" * 1024 + "." + "987654321" * 1024,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 64: Division 65536 words / 1024 words
+    run_benchmark_divide(
+        "Division 65536 words / 1024 words",
+        "123456789" * 32768 + "." + "123456789" * 32768,
+        "987654321" * 512 + "." + "987654321" * 512,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
     # Calculate average speedup factor (ignoring any cases that might have failed)
     if len(speedup_factors) > 0:
         var sum_speedup: Float64 = 0.0
diff --git a/docs/todo.md b/docs/todo.md
@@ -5,6 +5,7 @@ This is a to-do list for DeciMojo.
 - [ ] When Mojo supports global variables, implement a global variable for the `BigDecimal` class to store the precision of the decimal number. This will allow users to set the precision globally, rather than having to set it for each function of the `BigDecimal` class.
 - [ ] Implement different methods for adding decimojo types with `Int` types so that an implicit conversion is not required.
 - [ ] Use debug mode to check for unnecessary zero words before all arithmetic operations. This will help ensure that there are no zero words, which can simplify the speed of checking for zero because we only need to check the first word.
+- [ ] Check the `floor_divide()` function of `BigUInt`. Currently, the speed of division between similar-sized numbers is acceptable, but the speed of 2n-by-n, 4n-by-n, and 8n-by-n divisions decreases disproportionately. This is likely due to the segmentation of the dividend in the Burnikel-Ziegler algorithm.
 
 - [x] (#31) The `exp()` function performs slower than Python's counterpart in specific cases. Detailed investigation reveals the bottleneck stems from multiplication operations between decimals with significant fractional components. These operations currently rely on UInt256 arithmetic, which introduces performance overhead. Optimization of the `multiply()` function is required to address these performance bottlenecks, particularly for high-precision decimal multiplication with many digits after the decimal point.
 - [x] Implement different methods for augmented arithmetic assignments to improve memory efficiency and performance.
diff --git a/src/decimojo/bigdecimal/arithmetics.mojo b/src/decimojo/bigdecimal/arithmetics.mojo