Implement model quantization #84

Merged
mlsw merged 2 commits into main from model-quantization on Mar 4, 2025

@mlsw (Collaborator) commented Feb 26, 2025

This PR implements W8A8 model quantization (8-bit weights and 8-bit activations) for the TransformersModelForTokenClassificationNerStep. As this is an experimental feature, it is currently gated by environment variables.
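
To illustrate the idea described above, here is a minimal pure-Python sketch of an environment-variable gate around a W8A8-style computation. It is not the code from this PR: the variable name `EXAMPLE_ENABLE_W8A8`, the helper names, and the choice of symmetric per-tensor scaling are all assumptions made for illustration, and a real implementation would operate on model tensors rather than Python lists.

```python
import os

# Hypothetical variable name, for illustration only; the PR does not state the real one.
QUANTIZE_ENV_VAR = "EXAMPLE_ENABLE_W8A8"


def quantization_enabled(environ=None):
    """Return True when the (hypothetical) opt-in flag is set to a truthy value."""
    env = os.environ if environ is None else environ
    return env.get(QUANTIZE_ENV_VAR, "").strip().lower() in {"1", "true", "yes"}


def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the int8 range [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Fall back to a scale of 1.0 for an all-zero (or empty) input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dot(weights, activations, environ=None):
    """Dot product; uses 8-bit weights and 8-bit activations when the flag is set."""
    if not quantization_enabled(environ):
        # Default path: plain floating-point arithmetic.
        return sum(w * a for w, a in zip(weights, activations))
    q_weights, w_scale = quantize_int8(weights)
    q_acts, a_scale = quantize_int8(activations)
    # Accumulate in integers, then rescale back to floating point.
    acc = sum(w * a for w, a in zip(q_weights, q_acts))
    return acc * w_scale * a_scale
```

Because the flag is read at call time and defaults to off, the quantized path in this sketch only runs when it is explicitly opted into, which is the usual reason for gating an experimental feature this way.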

@mlsw force-pushed the model-quantization branch from 4c485e3 to 4019dbe on February 26, 2025 12:39
@mlsw marked this pull request as ready for review on February 26, 2025 12:47
@mlsw requested a review from paluchasz on February 26, 2025 12:47
@paluchasz (Collaborator) left a comment

LGTM

@mlsw force-pushed the model-quantization branch from 766d076 to 74132dd on March 4, 2025 11:34
@paluchasz self-requested a review on March 4, 2025 13:40
@mlsw merged commit ea14f6c into main on Mar 4, 2025
3 checks passed
@mlsw deleted the model-quantization branch on March 4, 2025 15:22

2 participants