forked from matiasb/python-unidiff
Open
The path() getter in the PatchedFile class has code that tries to handle quoted filenames:
quoted = filepath.startswith('"') and filepath.endswith('"')
if quoted:
    filepath = filepath[1:-1]
if RE_PATCH_FILE_PREFIX.match(filepath):
    filepath = filepath[2:]
if quoted:
    filepath = '"{}"'.format(filepath)
return filepath

However, if path() was meant to give the file name as it exists in the filesystem in the repository, this is not enough. The code simply strips the file prefix and re-wraps the result in quotes (e.g. extracting "name with \"quotes\"" from "a/name with \"quotes\"", and file from a/file).
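To make the behaviour concrete, here is a minimal standalone sketch of that quote handling. The helper name `strip_prefix` is hypothetical, and the regular expression is an assumption meant to mirror unidiff's `RE_PATCH_FILE_PREFIX` (a leading `a/` or `b/`); it is not copied from the library.

```python
import re

# Assumption: stands in for unidiff's RE_PATCH_FILE_PREFIX (leading "a/" or "b/").
RE_PATCH_FILE_PREFIX = re.compile(r'^(a|b)/')


def strip_prefix(filepath):
    """Hypothetical standalone copy of the quote handling quoted above."""
    quoted = filepath.startswith('"') and filepath.endswith('"')
    if quoted:
        filepath = filepath[1:-1]
    if RE_PATCH_FILE_PREFIX.match(filepath):
        filepath = filepath[2:]
    if quoted:
        filepath = '"{}"'.format(filepath)
    return filepath


print(strip_prefix('a/file'))                     # file
print(strip_prefix(r'"a/name with \"quotes\""'))  # "name with \"quotes\""
print(strip_prefix(r'"b/caf\303\251.txt"'))       # "caf\303\251.txt"
```

In the last two cases the result is still a C-quoted string: the backslash escapes and octal byte sequences are passed through untouched, so it does not match the name on disk.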
Should unidiff decode the quoted path, or provide a separate mechanism to decode the C-quoted paths that git diff uses?
Here is my code that actually tries to decode a C-quoted filename:
# ENCODING_ERRORS is not defined in this snippet; it is assumed here to be
# a codec error handler name such as 'surrogateescape'.
ENCODING_ERRORS = 'surrogateescape'


def decode_c_quoted_str(text: str) -> str:
    """C-style name unquoting

    See unquote_c_style() function in 'quote.c' file in git/git source code
    https://github.com/git/git/blob/master/quote.c#L401

    This is a subset of escape sequences supported by C and C++
    https://learn.microsoft.com/en-us/cpp/c-language/escape-sequences
    """
    escape_dict = {
        'a': '\a',  # Bell (alert)
        'b': '\b',  # Backspace
        'f': '\f',  # Form feed
        'n': '\n',  # New line
        'r': '\r',  # Carriage return
        't': '\t',  # Horizontal tab
        'v': '\v',  # Vertical tab
    }

    quoted = text.startswith('"') and text.endswith('"')
    if quoted:
        text = text[1:-1]  # remove quotes

    buf = bytearray()
    escaped = False
    oct_str = ''

    for ch in text:
        if not escaped:
            if ch != '\\':
                # Encode to UTF-8 bytes; ord(ch) would overflow a byte for non-ASCII characters
                buf.extend(ch.encode(errors=ENCODING_ERRORS))
            else:
                escaped = True
                oct_str = ''
        else:
            if ch in ('"', '\\'):
                buf.append(ord(ch))
                escaped = False
            elif ch in escape_dict:
                buf.append(ord(escape_dict[ch]))
                escaped = False
            elif '0' <= ch <= '7':  # octal values with first digit over 3 overflow a byte
                oct_str += ch
                if len(oct_str) == 3:
                    byte = int(oct_str, base=8)  # byte in octal notation
                    if byte > 255:
                        raise ValueError(f'Invalid octal escape sequence \\{oct_str} in "{text}"')
                    buf.append(byte)
                    escaped = False
                    oct_str = ''
            else:
                raise ValueError(f'Unexpected character \'{ch}\' in escape sequence when parsing "{text}"')

    if escaped:
        raise ValueError(f'Unfinished escape sequence when parsing "{text}"')

    text = buf.decode(errors=ENCODING_ERRORS)
    return text
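For comparison, the same decoding can be sketched more compactly with a bytes regular expression. This is an illustrative alternative, not code from unidiff or git. The names `unquote_c_style_path`, `_SIMPLE_ESCAPES` and `_ESCAPE_RE` are hypothetical, and `'surrogateescape'` is an assumed error handler. Unlike the function above, this sketch returns unquoted input unchanged and requires the first octal digit to be 0 to 3.

```python
import re

# Single-character escapes handled here, mapped to the byte they stand for.
_SIMPLE_ESCAPES = {
    b'a': b'\a', b'b': b'\b', b'f': b'\f', b'n': b'\n',
    b'r': b'\r', b't': b'\t', b'v': b'\v', b'"': b'"', b'\\': b'\\',
}
# A backslash followed by three octal digits (first digit 0-3),
# by any single byte, or by nothing at all (dangling backslash).
_ESCAPE_RE = re.compile(rb'\\([0-3][0-7]{2}|.|$)', re.DOTALL)


def unquote_c_style_path(text, errors='surrogateescape'):
    """Decode a C-quoted path; unquoted input is returned unchanged."""
    if len(text) < 2 or not (text.startswith('"') and text.endswith('"')):
        return text

    def replace(match):
        esc = match.group(1)
        if len(esc) == 3:                  # octal escape such as \303
            return bytes([int(esc, 8)])
        if esc in _SIMPLE_ESCAPES:         # \n, \t, \", \\ and friends
            return _SIMPLE_ESCAPES[esc]
        raise ValueError('Invalid escape sequence in {!r}'.format(text))

    raw = text[1:-1].encode('utf-8', errors)
    return _ESCAPE_RE.sub(replace, raw).decode('utf-8', errors)


print(unquote_c_style_path(r'"caf\303\251.txt"'))        # café.txt
print(unquote_c_style_path(r'"name with \"quotes\""'))   # name with "quotes"
print(unquote_c_style_path('plain.txt'))                 # plain.txt
```

Working on bytes means consecutive octal escapes are joined into one UTF-8 sequence before decoding, which is why `\303\251` comes out as a single `é`.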