[Bug] Heap-buffer-overflow in opencc::MaxMatchSegmentation::Segment #997

@oneafter

Description

We discovered a heap buffer overflow (read) vulnerability in OpenCC. The application crashes with a heap-buffer-overflow error when processing a specially crafted string. The crash occurs in opencc::MaxMatchSegmentation::Segment, where the code reads 1 byte past the end of the allocated string buffer.

Environment

  • OS: Linux x86_64
  • Compiler: Clang
  • Build Configuration: Release mode with ASan enabled.

Vulnerability Details

  • Target: OpenCC
  • Crash Type: Heap-buffer-overflow (READ of size 1)
  • Source File: src/MaxMatchSegmentation.cpp
  • Function: opencc::MaxMatchSegmentation::Segment
  • Line Number: 34 (Column 41)
  • Root Cause Analysis: The ASan report shows a read of size 1 at the address immediately following a 19-byte allocated region (0x503000a625c3 is located 0 bytes after the region [0x503000a625b0,0x503000a625c3)). The stack trace shows the call path SimpleConverter::Convert -> Converter::Convert -> MaxMatchSegmentation::Segment. It appears that the segmentation logic at line 34 iterates through the input string (likely handling multibyte UTF-8 characters) but fails to check the boundary condition correctly, so it reads one byte past the end of the string's heap buffer. If the 19-byte region is libstdc++'s allocation for an 18-byte string plus its NUL terminator, the faulting index is size() + 1 rather than size().

Reproduce

  1. Build OpenCC with Release optimization and ASan enabled.
  2. Compile the OpenCC harness with AddressSanitizer enabled (-fsanitize=address -g).
  3. Run the harness with the crashing file repro on standard input:
./harness < repro

ASAN report

==49077==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x503000a625c3 at pc 0x562ffa32281e bp 0x7fffacc40bf0 sp 0x7fffacc40be8
READ of size 1 at 0x503000a625c3 thread T0
    #0 0x562ffa32281d in opencc::MaxMatchSegmentation::Segment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) const /src/OpenCC/src/MaxMatchSegmentation.cpp:34:41
    #1 0x562ffa2bd721 in opencc::Converter::Convert(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) const /src/OpenCC/src/Converter.cpp:28:47
    #2 0x562ffa2bd0d3 in opencc::SimpleConverter::Convert(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) const /src/OpenCC/src/SimpleConverter.cpp:88:29
    #3 0x562ffa2b889b in main /src/OpenCC/build/../opencc_harness.cpp:37:49
    #4 0x7fb25fa021c9 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #5 0x7fb25fa0228a in __libc_start_main csu/../csu/libc-start.c:360:3
    #6 0x562ffa1cf794 in _start (/src/OpenCC/build/opencc_harness+0xa5794) (BuildId: 3cf08591ed6f6879b7ce8d25bc6889b222bda64c)

0x503000a625c3 is located 0 bytes after 19-byte region [0x503000a625b0,0x503000a625c3)
allocated by thread T0 here:
    #0 0x562ffa2b0cc1 in operator new(unsigned long) (/src/OpenCC/build/opencc_harness+0x186cc1) (BuildId: 3cf08591ed6f6879b7ce8d25bc6889b222bda64c)
    #1 0x562ffa2b8783 in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_construct<char const*>(char const*, char const*, std::forward_iterator_tag) /usr/lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/basic_string.tcc:229:14
    #2 0x562ffa2b8783 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::basic_string(char const*, unsigned long, std::allocator<char> const&) /usr/lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/basic_string.h:627:2
    #3 0x562ffa2b8783 in main /src/OpenCC/build/../opencc_harness.cpp:33:25
    #4 0x7fb25fa021c9 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #5 0x7fb25fa0228a in __libc_start_main csu/../csu/libc-start.c:360:3
    #6 0x562ffa1cf794 in _start (/src/OpenCC/build/opencc_harness+0xa5794) (BuildId: 3cf08591ed6f6879b7ce8d25bc6889b222bda64c)

SUMMARY: AddressSanitizer: heap-buffer-overflow /src/OpenCC/src/MaxMatchSegmentation.cpp:34:41 in opencc::MaxMatchSegmentation::Segment(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) const
Shadow bytes around the buggy address:
  0x503000a62300: fa fa fd fd fd fd fa fa 00 00 00 fa fa fa 00 00
  0x503000a62380: 00 fa fa fa fd fd fd fd fa fa 00 00 00 00 fa fa
  0x503000a62400: 00 00 00 00 fa fa 00 00 00 fa fa fa 00 00 00 fa
  0x503000a62480: fa fa fd fd fd fd fa fa 00 00 00 fa fa fa fd fd
  0x503000a62500: fd fd fa fa 00 00 00 00 fa fa 00 00 00 fa fa fa
=>0x503000a62580: 00 00 00 fa fa fa 00 00[03]fa fa fa 00 00 00 fa
  0x503000a62600: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x503000a62680: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x503000a62700: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x503000a62780: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x503000a62800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==49077==ABORTING
