A few suggestions #1

@leafyoung

A few suggestions

  1. Replace all print calls with logging.
        print(f"Configuration loaded successfully from {config_path}") # this can be a logger.info
        return config 
    except FileNotFoundError:
        print(f"ERROR: Configuration file not found at {config_path}") # this can be a logger.error
        return None
  2. Do not return None from an except block (same sample code as in the previous point). For an unrecoverable exception, raise instead of returning None.
  3. Do not catch an existing exception unless you have something to add: open() already raises FileNotFoundError (same sample code as above).
  4. Use a namedtuple (lightweight) to give each field of a long tuple a proper name:
> Optional[Tuple[pd.DataFrame, pd.DataFrame, List[str], List[str], List[str], List[str], List[str]]]:
  5. Keep configuration in one place.
    INFO is a more accurate level than ERROR here, because the defaults are applied and the program carries on.
    These settings should live in load_config(). The program can then stop early if any required parameters are missing.
if not CONFIG:
    print("ERROR: Configuration not loaded. Using default classification settings.")
    temperature = 0.0
    max_tokens = 200
    output_wrapper_tag = "category"
  6. The block below should move into load_config() as well.
else:
  # Assuming these might be added to config.yml under openai_settings or a new section
  openai_cfg = CONFIG.get('openai_settings', {})
  temperature = openai_cfg.get('classification_temperature', 0.0)
  max_tokens = openai_cfg.get('classification_max_tokens', 200)
  prompt_cfg = CONFIG.get('prompt_customization', {})
  output_wrapper_tag = prompt_cfg.get('output_wrapper_tag', "category")
  7. Adopt a class to better organize the functions.
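
To make the suggestions concrete, here is a rough sketch of how they could fit together. It is not the project's actual code. The names `ConfigLoader`, `ClassificationSettings`, `PreparedData`, its field names, and the `required_keys` parameter are all made up for illustration, and JSON stands in for `config.yml` so the sketch runs with the standard library only. The config keys and default values are the ones from the snippets above.

```python
import json
import logging
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, List, NamedTuple, Sequence

logger = logging.getLogger(__name__)


class PreparedData(NamedTuple):
    """Named fields instead of a long positional tuple (field names are hypothetical)."""
    train_df: Any  # would be a pd.DataFrame in the real code
    test_df: Any   # would be a pd.DataFrame in the real code
    text_columns: List[str]
    label_columns: List[str]


@dataclass(frozen=True)
class ClassificationSettings:
    """All classification settings and their defaults, kept in one place."""
    temperature: float = 0.0
    max_tokens: int = 200
    output_wrapper_tag: str = "category"


class ConfigLoader:
    """Groups loading, defaults, and validation in one class."""

    def __init__(self, config_path: str, required_keys: Sequence[str] = ()) -> None:
        self.config_path = Path(config_path)
        # Hypothetical: top-level sections the program cannot run without.
        self.required_keys = tuple(required_keys)

    def load_config(self) -> ClassificationSettings:
        # No try/except here: open() already raises FileNotFoundError,
        # and the caller cannot do anything useful with None.
        with self.config_path.open(encoding="utf-8") as handle:
            raw: Dict[str, Any] = json.load(handle)  # assumption: JSON instead of YAML
        logger.info("Configuration loaded successfully from %s", self.config_path)

        missing = [key for key in self.required_keys if key not in raw]
        if missing:
            # Stop early rather than limping along with a broken config.
            raise ValueError(f"Missing required config sections: {missing}")

        openai_cfg = raw.get("openai_settings", {})
        prompt_cfg = raw.get("prompt_customization", {})
        defaults = ClassificationSettings()
        return ClassificationSettings(
            temperature=openai_cfg.get("classification_temperature", defaults.temperature),
            max_tokens=openai_cfg.get("classification_max_tokens", defaults.max_tokens),
            output_wrapper_tag=prompt_cfg.get("output_wrapper_tag", defaults.output_wrapper_tag),
        )
```

With this shape the call site shrinks to something like `settings = ConfigLoader("config.json").load_config()`. The `if not CONFIG: ... else: ...` branch goes away because the defaults are applied inside the loader, and a missing file surfaces as the original FileNotFoundError.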
