-
Notifications
You must be signed in to change notification settings - Fork 2
Open
Labels
refactoringCode refactoring and cleanupCode refactoring and cleanup
Description
Overview
A thorough analysis of the entire codebase (~49,000 lines across 72+ source files) identified files and patterns that hinder extensibility and maintainability. This issue serves as the consolidated refactoring roadmap.
Critical — Must Refactor
1. render/typst_gen.rs (1,307 lines) — Monolithic god module
- Image preprocessing, page layout, shape rendering, gradient handling, header/footer generation, tab stop processing — all in one file.
generate_fixed_page()has 3 near-identical#set page(...)branches.GenCtxpassed as&mutthrough ~15 function signatures.- Action: Split into
typst_gen_images.rs,typst_gen_shapes.rs,typst_gen_headers.rs.
2. parser/omml.rs (1,209 lines) — Giant recursive parser
parse_omml_children()is 380 lines with a match on 20+ element types.unicode_to_typst()has 100+ match arms.- Action: Split by math element category (fractions, radicals, matrices, accents, etc.).
3. parser/pptx_text.rs (873 lines) — Oversized list-style parser
parse_pptx_list_style()is 315 lines with 6 inline helper functions and 5 state variables.- 3 near-identical merge functions (
merge_paragraph_style,merge_text_style,merge_pptx_bullet_definition) totaling 69 lines of boilerplate. - Action: Extract state struct, per-element parsers, and
Mergeabletrait/macro.
4. parser/pptx_slides.rs (996 lines) — Complex inheritance chain
parse_single_slide()~140 lines resolving master→layout→slide with 3 repetitiveresolve_effective_color_map()calls.- Action: Simplify inheritance resolution; extract helper functions.
5. render/typst_gen_lists.rs (773 lines) — Duplicated code paths
- Two parallel rendering paths (flex-width vs fixed-width) duplicate the same item loop logic.
- 5 near-identical paragraph attribute comparison helpers.
- Action: Extract
ListItemRendererstruct; deduplicate paths.
6. render/typst_gen_text.rs (670 lines) — Deep nesting
write_run_segment()is 89 lines with 5+ nesting levels and 6 boolean flags for bracket order.- Tab stop processing at 4 nesting levels.
- Action: Extract
TextWrapperstruct; break tab logic into smaller functions.
High — Should Refactor
7. parser/docx.rs (569 lines)
parse()is 183 lines;convert_paragraph_blocks()is 165 lines with 4 nesting levels.- Text extraction logic duplicated 3 times (lines ~479-527).
- Action: Factor
convert_paragraph_blocks()into smaller functions; deduplicate text extraction.
8. parser/cond_fmt.rs (411 lines)
build_cond_fmt_overrides()is 244 lines with 3+ nesting levels.- No polymorphism for rule types (CellIs, ColorScale, DataBar, IconSet all inline).
- Action: Extract
ConditionalRuletrait withevaluate()/apply_style().
9. lib_pipeline.rs (381 lines)
- Excessive
#[cfg]duplication between native and WASM targets (4 distinct code paths for nearly identical logic). - Unsafe
.unwrap()at line 328 onall_pdfs.into_iter().next(). dedup_warnings()called manually in 4 places.- Action: Extract platform trait; replace
.unwrap()with.expect()orResult; centralize warning dedup.
10. render/font_subst.rs (410 lines)
- Thread-local
RefCell<Option<FontSearchContext>>— mutable global state. - 4 duplicate recursive tree-walking functions:
collect_paragraph_fonts(),collect_table_fonts(),collect_header_footer_fonts(),collect_block_fonts(). - Action: Replace thread-local with parameter passing; introduce visitor pattern for tree walking.
11. render/pdf.rs (376 lines)
compile_to_pdf()andcompile_to_pdf_wasm()are nearly identical.- Duplicate image map construction in
MinimalWorld::new()variants. - Action: Unify into single function with
#[cfg]branch at world creation only.
12. render/typst_gen_tables.rs (318 lines)
generate_table_rows()56 lines with 3-level nested rowspan tracking.- Cell content generation duplicates
generate_block()match logic. - Action: Extract
RowspanTrackerstruct; reusegenerate_block().
13. pdf_ops.rs (250 lines)
- Unsafe
.unwrap()onas_dict_mut()(line 145). merge()andsplit()share duplicate page validation.- Action: Fix panics; extract common PDF document handling.
Medium — Worth Considering
| File | Lines | Issue |
|---|---|---|
parser/pptx_theme.rs |
717 | Color parsing could share code with docx_text.rs |
parser/docx_sections.rs |
456 | build_flow_page_from_section() exceeds 50 lines |
parser/docx_tables.rs |
460 | Silent truncation at MAX_TABLE_DEPTH = 64 (no warning logged) |
parser/chart.rs |
447 | 8 chart types matched with near-identical parse calls; should loop |
parser/xlsx.rs |
262 | parse() and parse_streaming() duplicate shared structure |
cli/server.rs |
306 | Manual multipart boundary parsing; fragile string slicing |
cli/metrics.rs |
285 | .lock().unwrap() appears 12+ times |
Cross-Cutting Duplication Patterns
| Pattern | Locations | Proposed Fix |
|---|---|---|
| Style merging boilerplate | docx_text.rs, docx_styles.rs, pptx_text.rs (3 near-identical merge functions) |
Mergeable trait or macro |
| Unit conversions | 12700 (EMU→pt) in 3 files, 20.0 (twips→pt) in 4 files |
Shared constants EMU_PER_PT, TWIPS_PER_PT |
| Color parsing | docx_text.rs, docx_tables.rs, cond_fmt.rs, pptx_theme.rs |
Consolidate into xml_util.rs |
| Font tree walking | font_subst.rs — 4 functions with identical recursive pattern |
Visitor pattern or generic walker |
#[cfg] branching |
lib_pipeline.rs, pdf.rs |
Platform trait abstraction |
Files That Need No Refactoring
config.rs,error.rs— clean and well-structuredir/module — clean intermediate representationparser/mod.rs,parser/xml_util.rs,parser/metadata.rs— simple and focusedparser/docx_contexts.rs+ 7 context submodules — good pattern-reduction- All
*_tests.rsfiles — comprehensive and well-named
Recommended Priority Order
- Split
typst_gen.rsinto sub-modules (highest impact) - Extract shared unit conversion constants (quick win)
- Create
Mergeabletrait/macro for style merging (~100 lines saved) - Split
parse_omml_children()by math element category - Extract
parse_pptx_list_style()into state struct + per-element parsers - Consolidate font tree walkers with visitor pattern
- Unify
#[cfg]paths inlib_pipeline.rsandpdf.rs - Fix unsafe
.unwrap()calls inlib_pipeline.rsandpdf_ops.rs
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
refactoringCode refactoring and cleanupCode refactoring and cleanup