Commit f0266cb

[decimal] Re-implement the true_divide() function for BigDecimal (#110)
This pull request introduces enhancements to the BigDecimal division functionality, including a refactoring of the `true_divide` function and the implementation of two new division methods: `true_divide_fast` and `true_divide_general`. These changes aim to improve precision, performance, and code maintainability.

### Division Function Refactoring:

* Refactored the `true_divide` function to simplify logic and delegate division tasks to specialized helper functions (`true_divide_fast` and `true_divide_general`). This improves readability and separates concerns.

### New Division Methods:

* **`true_divide_fast`**: Introduced a high-performance division method that avoids rounding and trailing zero removal, suitable for cases where speed is prioritized over formatting.
* **`true_divide_general`**: Added a precise division method that rounds results to the specified precision and removes unnecessary trailing zeros, ensuring compliance with exact division requirements.

### Code Simplification:

* Replaced unused imports (`time` and `testing`) with `math` in `arithmetics.mojo`, aligning dependencies with the updated implementation.

### Benchmark Enhancements:

* Added new benchmark cases (Cases 57–64) to test division of BigDecimal numbers with varying word sizes, including extremely large numbers. These cases use the newly introduced `ITERATIONS_LARGE_NUMBERS` constant for iterations.
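The difference between the two division paths can be illustrated with a small Python sketch. It is not the DeciMojo implementation: the function names, the `(coefficient, scale)` representation (value = coefficient × 10^-scale), and the round-half-up rule are all assumptions made for illustration, since the commit message does not state the rounding mode or the internal layout.

```python
def divide_fast_sketch(cx: int, sx: int, cy: int, sy: int, min_digits: int) -> tuple[int, int]:
    """Hypothetical 'fast' path: truncated quotient with at least `min_digits`
    significant digits. No rounding and no trailing-zero removal."""
    if cy == 0:
        raise ZeroDivisionError("division by zero")
    # Scale the dividend so the integer quotient carries enough digits.
    extra = max(0, min_digits + len(str(abs(cy))) - len(str(abs(cx))))
    quotient = (abs(cx) * 10**extra) // abs(cy)
    sign = -1 if (cx < 0) != (cy < 0) else 1
    return sign * quotient, sx - sy + extra


def divide_general_sketch(cx: int, sx: int, cy: int, sy: int, precision: int) -> tuple[int, int]:
    """Hypothetical 'general' path: round to `precision` significant digits
    (half-up, an assumption) and strip trailing zeros."""
    # One guard digit is enough for half-up rounding of a truncated quotient.
    q, scale = divide_fast_sketch(cx, sx, cy, sy, precision + 1)
    sign = -1 if q < 0 else 1
    q = abs(q)
    drop = len(str(q)) - precision
    if drop > 0:
        q, rem = divmod(q, 10**drop)
        if rem >= 5 * 10 ** (drop - 1):  # first dropped digit is 5 or more
            q += 1
        scale -= drop
    # Remove trailing zeros from the coefficient.
    while q != 0 and q % 10 == 0:
        q //= 10
        scale -= 1
    return sign * q, scale


# 1 / 4 with 5 digits: the fast path keeps the padding zeros,
# the general path normalizes the result.
print(divide_fast_sketch(1, 0, 4, 0, 5))     # (25000, 5), i.e. 0.25000
print(divide_general_sketch(1, 0, 4, 0, 5))  # (25, 2), i.e. 0.25
print(divide_general_sketch(2, 0, 3, 0, 5))  # (66667, 5), i.e. 0.66667
```

In this sketch the general path simply post-processes the fast path's output, which mirrors the delegation described above only loosely; the real functions may share far less code.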
1 parent 7a5101b commit f0266cb

3 files changed: +231 -105 lines changed

benches/bigdecimal/bench_bigdecimal_divide.mojo

Lines changed: 81 additions & 0 deletions
@@ -12,6 +12,7 @@ from collections import List

 alias PRECISION = 4096
 alias ITERATIONS = 100
+alias ITERATIONS_LARGE_NUMBERS = 3


 fn open_log_file() raises -> PythonObject:
@@ -758,6 +759,86 @@ fn main() raises:
         speedup_factors,
     )

+    # Case 57: Division 65536 words / 65536 words
+    run_benchmark_divide(
+        "Division 65536 words / 65536 words",
+        "123456789" * 32768 + "." + "123456789" * 32768,
+        "987654321" * 32768 + "." + "987654321" * 32768,
+        iterations,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 58: Division 262144 words / 262144 words
+    run_benchmark_divide(
+        "Division 262144 words / 262144 words",
+        "123456789" * 131072 + "." + "123456789" * 131072,
+        "987654321" * 131072 + "." + "987654321" * 131072,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 59: Division 65536 words / 32768 words
+    run_benchmark_divide(
+        "Division 65536 words / 32768 words",
+        "123456789" * 32768 + "." + "123456789" * 32768,
+        "987654321" * 16384 + "." + "987654321" * 16384,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 60: Division 65536 words / 16384 words
+    run_benchmark_divide(
+        "Division 65536 words / 16384 words",
+        "123456789" * 16384 + "." + "123456789" * 16384,
+        "987654321" * 8192 + "." + "987654321" * 8192,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 61: Division 65536 words / 8192 words
+    run_benchmark_divide(
+        "Division 65536 words / 8192 words",
+        "123456789" * 8192 + "." + "123456789" * 8192,
+        "987654321" * 4096 + "." + "987654321" * 4096,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 62: Division 65536 words / 4096 words
+    run_benchmark_divide(
+        "Division 65536 words / 4096 words",
+        "123456789" * 4096 + "." + "123456789" * 4096,
+        "987654321" * 2048 + "." + "987654321" * 2048,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 63: Division 65536 words / 2048 words
+    run_benchmark_divide(
+        "Division 65536 words / 2048 words",
+        "123456789" * 2048 + "." + "123456789" * 2048,
+        "987654321" * 1024 + "." + "987654321" * 1024,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
+    # Case 64: Division 65536 words / 1024 words
+    run_benchmark_divide(
+        "Division 65536 words / 1024 words",
+        "123456789" * 1024 + "." + "123456789" * 1024,
+        "987654321" * 512 + "." + "987654321" * 512,
+        ITERATIONS_LARGE_NUMBERS,
+        log_file,
+        speedup_factors,
+    )
+
     # Calculate average speedup factor (ignoring any cases that might have failed)
     if len(speedup_factors) > 0:
         var sum_speedup: Float64 = 0.0
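The operand sizes in these cases can be checked with a short Python sketch. It assumes that one "word" holds 9 decimal digits, which is inferred from the 9-character repeating patterns and is not stated in the diff; `word_count` is a hypothetical helper, not part of the benchmark.

```python
def word_count(operand: str, digits_per_word: int = 9) -> int:
    """Number of words needed for the digits of a decimal string.
    Assumes 9 decimal digits per word (an assumption, not from the source)."""
    digits = operand.replace(".", "")
    return -(-len(digits) // digits_per_word)  # ceiling division


# Case 57 operand: 32768 repeats on each side of the decimal point.
case_57 = "123456789" * 32768 + "." + "123456789" * 32768
print(word_count(case_57))  # 65536

# Case 60 dividend: 16384 repeats on each side of the decimal point.
case_60 = "123456789" * 16384 + "." + "123456789" * 16384
print(word_count(case_60))  # 32768
```

Under that assumption, the Case 57 operands match their label, while the Case 60 dividend comes out at 32768 words rather than the 65536 in its label. If a word is sized differently, the counts change accordingly.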

docs/todo.md

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ This is a to-do list for DeciMojo.
 - [ ] When Mojo supports global variables, implement a global variable for the `BigDecimal` class to store the precision of the decimal number. This will allow users to set the precision globally, rather than having to set it for each function of the `BigDecimal` class.
 - [ ] Implement different methods for adding decimojo types with `Int` types so that an implicit conversion is not required.
 - [ ] Use debug mode to check for unnecessary zero words before all arithmetic operations. This will help ensure that there are no zero words, which can speed up checking for zero because we only need to check the first word.
+- [ ] Check the `floor_divide()` function of `BigUInt`. Currently, the speed of division between similar-sized numbers is okay, but the speed of 2n-by-n, 4n-by-n, and 8n-by-n divisions decreases disproportionately. This is likely due to the segmentation of the dividend in the Burnikel-Ziegler algorithm.

 - [x] (#31) The `exp()` function performs slower than Python's counterpart in specific cases. Detailed investigation reveals the bottleneck stems from multiplication operations between decimals with significant fractional components. These operations currently rely on UInt256 arithmetic, which introduces performance overhead. Optimization of the `multiply()` function is required to address these performance bottlenecks, particularly for high-precision decimal multiplication with many digits after the decimal point.
 - [x] Implement different methods for augmented arithmetic assignments to improve memory efficiency and performance.
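One way to examine the scaling concern in the `floor_divide()` item is to time n-by-n, 2n-by-n, 4n-by-n, and 8n-by-n divisions side by side. The Python sketch below shows only the shape of such a harness: it times Python's built-in `int` division, not `BigUInt`, and the helper names and the 9-digit word size are assumptions.

```python
import time


def time_floor_divide(dividend_words: int, divisor_words: int, repeats: int = 3) -> float:
    """Best-of-`repeats` wall time in seconds for one floor division.
    Sizes are given in hypothetical 9-digit words."""
    # Build operands arithmetically to avoid int <-> str conversion limits.
    dividend = 10 ** (9 * dividend_words) - 1
    divisor = (10 ** (9 * divisor_words) - 1) // 3
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        _ = dividend // divisor
        best = min(best, time.perf_counter() - start)
    return best


def scaling_table(n_words: int = 1024) -> dict[str, float]:
    """Timings for kn-by-n divisions, k in (1, 2, 4, 8)."""
    return {
        f"{k}n-by-n": time_floor_divide(k * n_words, n_words)
        for k in (1, 2, 4, 8)
    }


for label, seconds in scaling_table().items():
    print(f"{label}: {seconds:.6f} s")
```

Comparing how the times grow between rows gives a rough picture of whether the cost rises faster than the operand sizes would suggest. The same loop structure could be ported to a Mojo benchmark against `BigUInt`.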
