Skip to content

Conversation

@juandiegopalomino
Copy link
Collaborator

Description

Contains the first version of the meister profiler/decorator. This is just a small package with a decorator, meister_timer, being the only thing (besides the config vars) someone should be setting. What this decorator does is simple: when a decorated function is called (works for sync and async), it prints the time it took and keeps track of the call's metadata (start time, end time, calling line, etc...) and at the end uses it to report a list of the top longest calls as well as a color-coded bar graph representing the rough times
Screen Shot 2022-01-03 at 1 30 03 PM

Safety checklist

  • This change is backwards compatible and safe to apply by existing users
  • This change will NOT lead to data loss
  • This change will NOT lead to downtime who already has an env/service setup

How has this change been tested, beside unit tests?

Ran it locally a bunch

Copy link
Collaborator

@NitinAgg NitinAgg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worried about adding this decorator everywhere. Would it be possible to add a visualizer on top of cprofile (context: https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script)

opta/cli.py Outdated
# after the command.
# However, we should still clean them up before the next command, or
# else it may interfere with it.
meister.MEISTER_REPORT_ENABLED = VERSION == DEV_VERSION
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to have a separate flag for this

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or an env var

Copy link
Contributor

@bigbitbus bigbitbus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could create another repo for meister under runx and market it as a tool for the community - it will get some traffic to runx (although it will be generic python perf searches rather than IaC only). Who knows, it might just take off!

@codecov
Copy link

codecov bot commented Jan 3, 2022

Codecov Report

Merging #561 (9f99a69) into main (bae4538) will increase coverage by 0.33%.
The diff coverage is 81.87%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #561      +/-   ##
==========================================
+ Coverage   73.15%   73.48%   +0.33%     
==========================================
  Files         100      101       +1     
  Lines        6678     6973     +295     
==========================================
+ Hits         4885     5124     +239     
- Misses       1793     1849      +56     
Flag Coverage Δ
unittests 73.48% <81.87%> (+0.33%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
opta/meister.py 50.92% <50.92%> (ø)
opta/cli.py 71.64% <66.66%> (-0.24%) ⬇️
opta/commands/apply.py 80.76% <100.00%> (+0.43%) ⬆️
opta/commands/destroy.py 52.84% <100.00%> (+2.41%) ⬆️
opta/commands/local_flag.py 87.50% <100.00%> (+1.29%) ⬆️
opta/commands/output.py 94.44% <100.00%> (+0.50%) ⬆️
opta/commands/push.py 77.27% <100.00%> (+2.27%) ⬆️
opta/core/aws.py 55.49% <100.00%> (+5.79%) ⬆️
opta/core/azure.py 87.34% <100.00%> (+1.04%) ⬆️
opta/core/gcp.py 53.03% <100.00%> (+4.26%) ⬆️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bae4538...9f99a69. Read the comment docs.

@juandiegopalomino
Copy link
Collaborator Author

juandiegopalomino commented Jan 3, 2022

@NitinAgg believe me, I truly wish you were right, and doing something like that was the first thing I tried. Unfortunately I abandoned that approach because

  1. Usually the profiler needed to be run as a wrapping command or weird flag
  2. It did not give you the important metadata (e.g. start time, end time, caller, etc...)
  3. (and this is what really killed it), those profilers checked everything. Every call to imported function, and every line of code. It was an overkill and trying to parse the data actually slowed down opta to a snail crawl

I agree with you that having it as a decorator in every function is tedious and it was a defeat I'm not happy with, but I couldn't think of a way around it. Obviously there has to be a way to get this, but I was way down the rabbit hole and still no sight of the solution so I cut my losses. I don't think it was a major setback as we could in the end get more succinct, "quick" metadata which we wanted for the purpose of increasing opta's speed. But if you do find a way to simply have a "magic one line that sets it all up" pls pls pls let me knw

@juandiegopalomino
Copy link
Collaborator Author

@bigbitbus Yes we could, but I want to let it "incubate" a bit in opta to see if it truly is useful.

Copy link
Collaborator

@rabbitfang rabbitfang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a cool feature. I wonder if we could find a way to automatically annotate functions that opta defines.

Pairing up the function colors with the lanes below might be hard to do. A suggestion would be to print the function's fully qualified name (or part of it; includes module name) inside of the bar graphs in white.

opta/meister.py Outdated
MEISTER_START_TIME = time.time()


def meister_time(func: Callable) -> Callable:
Copy link
Collaborator

@rabbitfang rabbitfang Jan 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I took a stab at removing all of the duplication of this function, and I was semi-successful, but I don't know if it actually reads better (its also untested):

T = TypeVar('T')
TCallable = TypeVar('TCallable', bound=Callable)
U = TypeVar('U')

def profile(func: TCallable[T, U]) -> TCallable[T, U]:
    def time_func(*args: Any, **kwargs: Any) -> None:
        if not MEISTER_REPORT_ENABLED:
            # Skip over the expensive stack inspection and storing of func args if this isn't enabled
            yield
            return

        frame_info = inspect.stack()[1]
        start_time = time.time()
        yield
        end_time = time.time()
        if end_time - start_time <= MEISTER_TIME_THRESHOLD:
            return

        MEISTER_RECORDS.append(
            {
                "function_name": func.__name__,
                "positional_args": args,
                "keyword_args": kwargs,
                "start_time": start_time,
                "end_time": end_time,
                "id": uuid4(),
                "caller_file": "/".join(frame_info.filename.split("/")[-2:]),
                "caller_line": frame_info.lineno,
            }
        )
        print(
            f"{fg('green')}Meister{attr(0)}: Function {func.__name__} took {end_time - start_time} seconds"
        )

    if iscoroutinefunction(func):
        @wraps(func)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            timer = time_func(*args, **kwargs)
            next(timer) # Start timer

            output = await func(*args, **kwargs)

            try:
                next(timer) # Complete the timing
            except StopIteration:
                pass

            return output
    else:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            timer = time_func(*args, **kwargs)
            next(timer) # Start timer

            output = func(*args, **kwargs)

            try:
                next(timer) # Complete the timing
            except StopIteration:
                pass

            return output


    return wrapper

Used a different function name, so usage changes to (but change in name is unimportant):

from opta import meister

@meister.profile
def foo():
   ...

opta/cli.py Outdated
# after the command.
# However, we should still clean them up before the next command, or
# else it may interfere with it.
meister.MEISTER_REPORT_ENABLED = VERSION == DEV_VERSION
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or an env var

@bigbitbus
Copy link
Contributor

Since the decorator is being added to most class methods, we could look at an automated way of adding this to the callables in a class (e.g. https://stackoverflow.com/questions/6307761/how-to-decorate-all-functions-of-a-class-without-typing-it-over-and-over-for-eac); that way we won't make so many code changes and it will be relatively easy to comment/uncomment this for code clarity.

@NitinAgg
Copy link
Collaborator

NitinAgg commented Jan 4, 2022

3. (and this is what really killed it), those profilers checked everything. Every call to imported function, and every line of code. It was an overkill and trying to parse the data actually slowed down opta to a snail crawl

@bigbitbus I was also looking into this. but unfortunately we don't use classes that much and even if we do we would still have to add the decorator to each class.

Copy link
Collaborator

@rabbitfang rabbitfang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor change. Otherwise, LGTM

ASYNC_LOCK: Final = Lock()
REPORT_SIZE: Final = 20
TIME_THRESHOLD: Final = 0.1
REPORT_ENABLED: bool = True
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
REPORT_ENABLED: bool = True
REPORT_ENABLED: bool = False

This will avoid measuring functions when running unit tests.

Contains the first version of the meister profiler/decorator. This is just a small package with a decorator, meister_timer, being the only thing (besides the config vars) someone should be setting. What this decorator does is simple: when a decorated function is called (works for sync and async), it prints the time it took and keeps track of the call's metadata (start time, end time, calling line, etc...) and at the end uses it to report a list of the top longest calls as well as a color-coded bar graph representing the rough times

testfixing


Update opta/meister.py

Co-authored-by: Patrick Fiedler <rabbitfang@users.noreply.github.com>
Update opta/cli.py

Co-authored-by: Patrick Fiedler <rabbitfang@users.noreply.github.com>
Update opta/meister.py

Co-authored-by: Patrick Fiedler <rabbitfang@users.noreply.github.com>
Update opta/meister.py

Co-authored-by: Patrick Fiedler <rabbitfang@users.noreply.github.com>
Update opta/meister.py

Co-authored-by: Patrick Fiedler <rabbitfang@users.noreply.github.com>
Update opta/meister.py

Co-authored-by: Patrick Fiedler <rabbitfang@users.noreply.github.com>
Update opta/meister.py

Co-authored-by: Patrick Fiedler <rabbitfang@users.noreply.github.com>
Update opta/meister.py

Co-authored-by: Patrick Fiedler <rabbitfang@users.noreply.github.com>
bug fixing


addressing cr


typo


bug fixing


bugfixing


wip
Copy link
Collaborator

@NitinAgg NitinAgg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment

# after the command.
# However, we should still clean them up before the next command, or
# else it may interfere with it.
meister.REPORT_ENABLED = VERSION in [DEV_VERSION, ""]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this based on an env flag instead. I don't want outside contributors to run into this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants