GoogleVisionFormatter returns OpenPechaFS pecha object with incorrect is_private value #239

@ta4tsering

Describe the bug
When I use the recently updated GoogleVisionFormatter class to create an OPF from OCR output, the is_private value in the returned OpenPechaFS pecha object is True even when the copyright status of the work_id is Public domain. As a result, the published OPF is private when it should be public.

To Reproduce
Steps to reproduce the behavior:

  1. Run the script below:
     from pathlib import Path

     from openpecha.core.ids import get_initial_pecha_id
     from openpecha.core.pecha import OpenPechaGitRepo
     from openpecha.formatters.ocr.google_vision import (
         GoogleVisionBDRCFileProvider,
         GoogleVisionFormatter,
     )

     def make_opf(ocr_import_info, ocr_path):
         work_id = "W3CN18530"
         data_provider = GoogleVisionBDRCFileProvider(
             bdrc_scan_id=work_id,
             ocr_import_info=ocr_import_info,
             ocr_disk_path=ocr_path,
         )
         pecha_id = get_initial_pecha_id()
         formatter = GoogleVisionFormatter(f"./pechas/{pecha_id}/{pecha_id}.opf")
         # The pecha returned here is where is_private is unexpectedly True.
         pecha = formatter.create_opf(data_provider, pecha_id, {}, ocr_import_info)
         # Re-type the object as a git-backed pecha so it can be published.
         pecha.__class__ = OpenPechaGitRepo
         pecha.storage = None
         pecha.meta.id = pecha.pecha_id
         pecha.save_meta()
         pecha.publish(asset_path=ocr_path, asset_name="ocr_output")
         return pecha

     if __name__ == "__main__":
         ocr_import_info = {
             "source": "bdrc",
             "software": "vision",
             "batch": "batch-G8E3G",
             "expected_default_language": "bo",
             "bdrc_scan_id": "W3CN18530",
             "ocr_info": {
                 "timestamp": "2023-01-20T17:42:00",
                 "imagesfolder": "images",
             },
         }
         ocr_path = Path("./ocrs/W3CN18530")
         pecha = make_opf(ocr_import_info, ocr_path)
    
    
  2. Use the OCR output linked below as the input:
    OCR output of W3CN18530

Expected behavior
The is_private value of the returned OpenPechaFS pecha object should be False.
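
The expectation can be written down as a small check. This is only a sketch: `expected_is_private` and `check_pecha_privacy` are hypothetical helpers, not part of the openpecha toolkit, and the exact spelling of the copyright status string ("Public domain") is an assumption taken from this report.

```python
# Hypothetical helpers for illustration only; they are NOT openpecha APIs.
PUBLIC_DOMAIN = "Public domain"  # assumed spelling of the copyright status


def expected_is_private(copyright_status: str) -> bool:
    """Return False for public-domain works and True for anything else."""
    return copyright_status.strip().lower() != PUBLIC_DOMAIN.lower()


def check_pecha_privacy(is_private: bool, copyright_status: str) -> None:
    """Raise AssertionError if the flag contradicts the copyright status."""
    expected = expected_is_private(copyright_status)
    assert is_private == expected, (
        f"is_private is {is_private}, expected {expected} "
        f"for copyright status {copyright_status!r}"
    )
```

With the values from this report, `check_pecha_privacy(True, "Public domain")` would raise, which matches the behavior described above.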

Screenshots
The screenshot below shows what the GoogleVisionFormatter returns:
Screenshot 2023-01-24 at 9 49 43 AM

Desktop (please complete the following information):
OpenPecha toolkit

  • Version 0.9.23

Additional context
None
