Skip to content
This repository was archived by the owner on Sep 28, 2023. It is now read-only.
This repository was archived by the owner on Sep 28, 2023. It is now read-only.

takeTill performs exponential backtracking in environments #122

@isovector

Description

@isovector

parseLaTeX performs exponential amounts of work in the number of environments. I have two files, book2.txt and book3.txt which are roughly the same size in bytes, but the latter has ~3x as many environments. book2.txt parses quickly, but I haven't been patient enough to get book3.txt to.

The shiatsu program below is doing nothing more than reading the file and calling parseLaTeX.

➜  /tmp/ebook$ wc book2.txt
  6347  33541 245180 book2.txt

➜  /tmp/ebook$ cat book2.txt | grep "begin" | wc -l
232

➜  /tmp/ebook$ time shiatsu +RTS -p -RTS book2.txt - &> /dev/null
shiatsu +RTS -p -RTS book2.txt - &> /dev/null  2.23s user 0.10s system 99% cpu 2.343 total
➜  /tmp/ebook$ wc book3.txt                                             
  8385  41123 273877 book3.txt

➜  /tmp/ebook$ cat book3.txt | grep "begin" | wc -l
583

➜  /tmp/ebook$ time shiatsu +RTS -p -RTS book3.txt - &> /dev/null
^C
shiatsu +RTS -p -RTS book3.txt - &> /dev/null  43.29s user 0.33s system 99% cpu 43.780 total

Here's the profile information


COST CENTRE      MODULE                 SRC                                         %time %alloc

takeTill         Text.LaTeX.Base.Parser Text/LaTeX/Base/Parser.hs:354:1-48           74.5   76.6
peekChar         Text.LaTeX.Base.Parser Text/LaTeX/Base/Parser.hs:348:1-62            5.2    4.2
cmdArg           Text.LaTeX.Base.Parser Text/LaTeX/Base/Parser.hs:(222,1)-(232,38)    4.9    4.7
dolMath          Text.LaTeX.Base.Parser Text/LaTeX/Base/Parser.hs:(304,1)-(312,7)     2.2    2.7
envBody.endenv   Text.LaTeX.Base.Parser Text/LaTeX/Base/Parser.hs:189:9-75            2.0    2.0
latexBlockParser Text.LaTeX.Base.Parser Text/LaTeX/Base/Parser.hs:(116,1)-(123,5)     1.7    1.0
isSpecial        Text.LaTeX.Base.Parser Text/LaTeX/Base/Parser.hs:335:1-29            1.5    0.0
cmdArgs          Text.LaTeX.Base.Parser Text/LaTeX/Base/Parser.hs:(217,1)-(219,30)    1.4    1.4
envName          Text.LaTeX.Base.Parser Text/LaTeX/Base/Parser.hs:(180,1)-(185,21)    1.2    1.5
text2            Text.LaTeX.Base.Parser Text/LaTeX/Base/Parser.hs:(144,1)-(147,35)    0.8    1.0
env              Text.LaTeX.Base.Parser Text/LaTeX/Base/Parser.hs:(163,1)-(177,22)    0.8    1.1
comment          Text.LaTeX.Base.Parser Text/LaTeX/Base/Parser.hs:(323,1)-(328,23)    0.7    1.1

                                                                                                                                   individual      inherited
COST CENTRE                          MODULE                  SRC                                               no.      entries  %time %alloc   %time %alloc

  parseLaTeX                         Text.LaTeX.Base.Parser  Text/LaTeX/Base/Parser.hs:96:1-45                 2793          0    0.0    0.0   100.0  100.0
   parseLaTeXWith                    Text.LaTeX.Base.Parser  Text/LaTeX/Base/Parser.hs:(99,1)-(101,63)         2794          1    0.0    0.0   100.0  100.0
    latexParser                      Text.LaTeX.Base.Parser  Text/LaTeX/Base/Parser.hs:112:1-57                2796          0    0.0    0.0   100.0  100.0
     latexBlockParser                Text.LaTeX.Base.Parser  Text/LaTeX/Base/Parser.hs:(116,1)-(123,5)         2799          0    1.6    1.0   100.0  100.0
      text                           Text.LaTeX.Base.Parser  Text/LaTeX/Base/Parser.hs:(131,1)-(138,71)        2801          0    0.2    0.2    76.5   75.5
       peekChar                      Text.LaTeX.Base.Parser  Text/LaTeX/Base/Parser.hs:348:1-62                2803          0    3.4    2.5     3.4    2.5
       takeTill                      Text.LaTeX.Base.Parser  Text/LaTeX/Base/Parser.hs:354:1-48                2808          0   67.3   66.8    72.9   72.8
        verbatimEnvironments         Text.LaTeX.Base.Parser  Text/LaTeX/Base/Parser.hs:72:5-24                 3124      14881    0.0    0.0     0.0    0.0

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions