compare-the-speed-of-grep-with-python-regex.txt: Clean up and expand#119

Open
mfwitten wants to merge 13 commits into szabgab:main from
mfwitten:0000-comparison-grep-python

@mfwitten

This patch series does the following:

  • Fixes typos and grammar.

  • Makes comparisons more consistent.

  • Constructs optimized Python code in steps.

However:

  • I have not tested the rendering of the altered page; I only edited the source file.

The git history has already been designed to clearly delineate this patch:

$ git log --oneline --graph 71596046^..
*   e5d6ef96 compare-the-speed-of-grep-with-python-regex.txt: Clean up and expand
|\  
| * 6128a670 Update the grep version to what I'm using
| * 89b487b5 Update the examples of a regular expression that is more complex
| * 9020ef82 Add output from test command
| * 7fdc142b grep_speed*.py: Match the bash code better at first
| * 69e72c73 typo/grammar
| * 6400c709 grammar
| * 514dcc77 better punctuation: s/,/;/
| * ebdfe43b typo/grammar
| * 3ae3b21e typo/grammar
| * 28ddbcf9 typo/grammar
| * 5b3a0bb1 Move the command lines that create and verify the test input
| * 18e2b1fc typo/grammar
|/  
* 71596046 update

You may want to update the Perl version of this page, which is what prompted this work in the first place; if I recall correctly, perl tries to compile regular expressions only once when possible, and so you may be comparing apples and oranges there.

In addition, this is a one-off PR; I will not be making any improvements to my patch series, so if you want to use my work, please apply this series and then add your changes on top as you see fit.

This is how I wish the article had been presented when I was reading it.

* For a better initial comparison, the Python code has been
  altered to match the bash code as closely as possible; a
  function named 'grep' is used to simulate a call to the
  program 'grep'.

* Next, a couple of new Python code files have been
  introduced:

    examples/grep_speed-open-once.py
    examples/grep_speed-optimized.py

  In particular:

    * grep_speed-open-once.py: This simply alters the Python
      code to ensure that the file is opened only once, using
      'seek' to return to the beginning of the file on each
      iteration.

    * grep_speed-optimized.py: In addition, this one compiles
      the regular expression only once, which significantly
      improves the performance, but the results are still
      very slow.

* The article text has been updated to reflect these changes;
  I also used my timing results to make sure that things are
  consistent.

* The optimized Python code is discussed initially.

* The unoptimized Python code is discussed next.

* The timing has been updated to reflect what I'm seeing.
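
For readers who do not have the patched files at hand, the three stages described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from examples/grep_speed*.py: the function names, the sample file, and the pattern are all hypothetical, chosen to show the difference between reopening the file on every call, reusing one handle with 'seek', and compiling the regular expression once.

```python
import os
import re
import tempfile


def grep_naive(pattern, path):
    """Stage 1 (hypothetical): open the file and pass the pattern string on every call."""
    with open(path) as fh:
        return [line for line in fh if re.search(pattern, line)]


def grep_open_once(pattern, fh):
    """Stage 2 (hypothetical): reuse an open handle; 'seek' rewinds it each time."""
    fh.seek(0)
    return [line for line in fh if re.search(pattern, line)]


def grep_optimized(compiled, fh):
    """Stage 3 (hypothetical): as stage 2, but with a regex compiled once by the caller."""
    fh.seek(0)
    return [line for line in fh if compiled.search(line)]


def demo(rounds=3):
    """Run each variant 'rounds' times on a tiny made-up input and collect the matches."""
    pattern = r"\d+"  # assumed pattern, for illustration only
    results = []
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "sample.txt")
        with open(path, "w") as out:
            out.write("apple 1\nbanana 22\ncherry\n")

        for _ in range(rounds):
            results.append(grep_naive(pattern, path))

        compiled = re.compile(pattern)  # compiled a single time, outside the loop
        with open(path) as fh:
            for _ in range(rounds):
                results.append(grep_open_once(pattern, fh))
                results.append(grep_optimized(compiled, fh))
    return results


if __name__ == "__main__":
    for matches in demo():
        print(matches)
```

All three variants should return the same matching lines; only the amount of repeated work per call differs, which is what the timings in the article are meant to expose.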