[C-API, python-package] Add leveled logging callback to C API and Python bindings#7205

Open
AnyCPU wants to merge 4 commits into lightgbm-org:master from AnyCPU:feature/leveled_logging

AnyCPU commented Mar 17, 2026

Add leveled logging callback to C API and Python bindings

The existing LGBM_RegisterLogCallback sends log messages as raw const char* with no severity metadata.
Each log event arrives as three separate callback invocations (prefix, body, newline), forcing bindings to reassemble them.

The Python package works around this today:

  def _normalize_native_string(func):
      """Join log messages from native library which come by chunks."""
      msg_normalized = []

      @wraps(func)
      def wrapper(msg):
          nonlocal msg_normalized
          if msg.strip() == "":
              msg = "".join(msg_normalized)
              msg_normalized = []
              return func(msg)
          else:
              msg_normalized.append(msg)

      return wrapper

I think this is fragile (it relies on a whitespace-only chunk as the sentinel that triggers a flush) and it loses the log level: everything goes to logger.info() regardless of severity.
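
To make the flush behavior concrete, here is a minimal sketch that feeds the wrapper above the three chunks of one log event. The chunk contents and the `received` list (a stand-in for a logger) are illustrative assumptions, not taken from the package.

```python
from functools import wraps


def _normalize_native_string(func):
    """Join log messages from native library which come by chunks."""
    msg_normalized = []

    @wraps(func)
    def wrapper(msg):
        nonlocal msg_normalized
        if msg.strip() == "":
            msg = "".join(msg_normalized)
            msg_normalized = []
            return func(msg)
        else:
            msg_normalized.append(msg)

    return wrapper


received = []  # hypothetical sink standing in for logger.info


@_normalize_native_string
def sink(msg):
    received.append(msg)


# One log event arrives as three calls: prefix, body, newline.
for chunk in ("[LightGBM] [Warning] ", "No further splits", "\n"):
    sink(chunk)

# A second event whose whitespace-only chunk never arrives stays buffered.
sink("[LightGBM] [Info] ")
sink("partial message")

# Only the first event was flushed; its severity survives only as text.
print(received)
```

The second event illustrates the fragility: until a whitespace-only chunk shows up, the buffered text never reaches the logger.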

And I think that other language bindings should not have to replicate this pattern.

This MR adds LGBM_RegisterLogCallbackWithLevel, which delivers (int level, const char* msg): one call per event, no prefix, no trailing newline.
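
As a rough illustration of the `(int level, const char* msg)` shape from a binding's point of view, the sketch below declares a matching ctypes callback type and invokes it directly from Python to simulate the native side. The `void` return type, the names `LEVELED_CALLBACK` and `_on_log`, and the UTF-8 decoding are assumptions; the sketch does not call into the LightGBM library.

```python
import ctypes

# Assumed C signature: void (*)(int level, const char* msg)
LEVELED_CALLBACK = ctypes.CFUNCTYPE(None, ctypes.c_int, ctypes.c_char_p)

events = []  # hypothetical sink for (level, message) pairs


def _on_log(level, msg):
    # ctypes hands a c_char_p argument to the Python callback as bytes.
    events.append((level, msg.decode("utf-8")))


# Keep a reference to the thunk for as long as native code may call it;
# otherwise it can be garbage-collected while still registered.
thunk = LEVELED_CALLBACK(_on_log)

# Calling the thunk directly stands in for the native library here.
thunk(0, b"There are no meaningful features")
thunk(1, b"Number of positive: 2, number of negative: 2")

print(events)
```

Each event is one call carrying its own level, so no buffering or sentinel is needed on the binding side.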

CHANGES

C++ / C API

  • New LeveledCallback dispatch in Log::Write() and Log::Fatal(), priority: leveled -> legacy -> stdout/stderr;
  • LGBM_RegisterLogCallbackWithLevel;
  • LGBM_UnregisterLogCallbackWithLevel;
  • C_API_LOG_LEVEL_FATAL (-1), _WARNING (0), _INFO (1), _DEBUG (2) constants;
  • Fixed missing _TRUNCATE in MSVC vsnprintf_s calls.

PYTHON

  • register_leveled_logger() / unregister_leveled_logger() in basic.py;
  • _log_callback_with_level() routes Fatal->error, Warning->warning, Info->info, Debug->debug;
  • Exception-safe ctypes callback (try/except + warnings.warn — throwing from ctypes into C++ is UB);
  • _DummyLeveledLogger default (stdout for info/debug/warning, stderr for error).
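
The routing and the exception guard listed above could look roughly like the sketch below. The level values follow the constants in the C API list. The logger name, the function name `log_callback_with_level`, and the fallback to `info` for unknown levels are assumptions rather than the PR's actual code.

```python
import logging
import warnings

# Level values as listed in the PR description.
C_API_LOG_LEVEL_FATAL = -1
C_API_LOG_LEVEL_WARNING = 0
C_API_LOG_LEVEL_INFO = 1
C_API_LOG_LEVEL_DEBUG = 2

logger = logging.getLogger("lightgbm_sketch")  # hypothetical logger name
logger.setLevel(logging.DEBUG)  # let debug records through for the demo

_ROUTES = {
    C_API_LOG_LEVEL_FATAL: logger.error,
    C_API_LOG_LEVEL_WARNING: logger.warning,
    C_API_LOG_LEVEL_INFO: logger.info,
    C_API_LOG_LEVEL_DEBUG: logger.debug,
}


def log_callback_with_level(level, msg):
    """Route one log event by level; never let an exception escape."""
    try:
        # Assumption: unknown levels fall back to info.
        route = _ROUTES.get(level, logger.info)
        route(msg)
    except Exception as exc:
        # Report the failure without propagating it to the caller.
        warnings.warn(f"log callback failed: {exc!r}")


log_callback_with_level(C_API_LOG_LEVEL_WARNING, "example warning")
```

A dictionary lookup keeps the level-to-method mapping in one place, which makes it straightforward to assert on in tests.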

TESTS

  • Validation, unit, integration, and lifecycle tests for all four log levels;
  • End-to-end test triggering Log::Fatal through ctypes.

jameslamb (Member) left a comment

Thanks for the PR. I'll review when I can. Some clarifying questions...

forcing bindings to reassemble them.

Could you link to examples where this is happening? I've asked you for that twice before, in these previous conversations:

Seeing those examples would help us to understand how these proposed changes help. We won't take on an expansion of the library's public API (and therefore complexity and maintenance burden) without a better understanding of the benefit.

Validation, unit, integration, and lifecycle tests for all four log levels;

I see only unit tests added in this PR. What do you mean by "integration" and "lifecycle" tests? Are those something you're planning to add but haven't yet?

AnyCPU (Author) commented Mar 17, 2026

@jameslamb

Seeing those examples would help us to understand how these proposed changes help

Take a look at the Python snippet above; it is taken from existing code in the Python package.
Besides Python, every language binding built on the C API has to do something similar because of the C API's data contract.

If you are wondering which language binding I'm working on: for now, Golang.

Without reassembly, a naive binding will output something like:

  2026/03/17 14:30:01 INFO [LightGBM] [Warning]
  2026/03/17 14:30:01 INFO There are no meaningful features which satisfy the provided configuration
  2026/03/17 14:30:01 INFO
  2026/03/17 14:30:01 INFO [LightGBM] [Info]
  2026/03/17 14:30:01 INFO Number of positive: 2, number of negative: 2
  2026/03/17 14:30:01 INFO
  2026/03/17 14:30:01 INFO [LightGBM] [Warning]
  2026/03/17 14:30:01 INFO Stopped training because there are no more leaves that meet the split requirements
  2026/03/17 14:30:01 INFO
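
Even after reassembly, a binding that wants the severity has to recover it from the text. The sketch below shows one way that could be done in Python, assuming the `[LightGBM] [Level] ` prefix shape seen in the output above; the helper name `split_level` and the regular expression are hypothetical.

```python
import re

# Assumed prefix shape, based on the sample output: "[LightGBM] [Level] "
_PREFIX = re.compile(r"^\[LightGBM\] \[(\w+)\] ")


def split_level(line):
    """Return (level_name, message); level_name is None if no prefix matches."""
    match = _PREFIX.match(line)
    if match is None:
        return None, line
    return match.group(1), line[match.end():]


level, message = split_level(
    "[LightGBM] [Warning] Stopped training because there are no more leaves "
    "that meet the split requirements"
)
print(level, "|", message)
```

Parsing like this ties the binding to the exact prefix format, which is the kind of coupling a callback that passes the level as a separate argument would avoid.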

What do you mean by "integration" and "lifecycle" tests?

  • test_register_leveled_logger_routing runs lgb.train() end-to-end and asserts messages arrive through the full C++ -> ctypes -> Python path (integration);
  • test_unregister_leveled_logger tests register -> unregister -> re-register with thunk and globals cleanup (lifecycle);
  • test_fatal_through_leveled_callback triggers Log::Fatal from C++ through the callback.

Happy to change the wording if the terminology is confusing.

AnyCPU requested a review from jameslamb March 18, 2026 11:53

AnyCPU (Author) commented Mar 20, 2026

  • Added fixes according to the lint reports;
  • The conda error looks like a transient network flake plus a conda bug (exit code 0 on failure), unrelated to this MR.
