Can't get ingest-code.py to run #9

@sbowman

If I try to install the requirements using Python 3.12, the gevent build fails:

Collecting gevent<23.0.0,>=22.10.2 (from chainlit->-r requirements.txt (line 4))
  Using cached gevent-22.10.2.tar.gz (6.6 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
...

I can get 3.12 to work by setting a version for chainlit in requirements.txt:

chainlit > 1

Alternatively, I can use Python 3.9 (or 3.10) and everything installs as is (no modifications to requirements.txt).

However, in either case, I get an error like this when trying to run ingest-code.py:

Traceback (most recent call last):
  File "/Users/sbowman/projects/rag_time/ingest-code.py", line 84, in <module>
    create_vector_database()
  File "/Users/sbowman/projects/rag_time/ingest-code.py", line 36, in create_vector_database
    chunked_documents.extend(chunk_code())
  File "/Users/sbowman/projects/rag_time/ingest-code.py", line 61, in chunk_code
    loaded_documents = loader.load()
  File "/Users/sbowman/projects/rag_time/.venv-rag-time/lib/python3.9/site-packages/langchain_core/document_loaders/base.py", line 31, in load
    return list(self.lazy_load())
  File "/Users/sbowman/projects/rag_time/.venv-rag-time/lib/python3.9/site-packages/langchain_community/document_loaders/generic.py", line 116, in lazy_load
    yield from self.blob_parser.lazy_parse(blob)
  File "/Users/sbowman/projects/rag_time/.venv-rag-time/lib/python3.9/site-packages/langchain_community/document_loaders/parsers/language/language_parser.py", line 195, in lazy_parse
    code = blob.as_string()
  File "/Users/sbowman/projects/rag_time/.venv-rag-time/lib/python3.9/site-packages/langchain_core/documents/base.py", line 153, in as_string
    return f.read()
  File "/Users/sbowman/.pyenv/versions/3.9.21/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
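
For what it's worth, byte 0x89 at position 0 is how a PNG file starts, so one possibility is that the loader is picking up an image (or some other binary file) from the directory being ingested and trying to read it as text. That is only a guess from the traceback. A minimal sketch for checking it, using only the standard library; the helper name `find_non_utf8_files` is made up for illustration and is not part of this project or of LangChain:

```python
from pathlib import Path


def find_non_utf8_files(root):
    """Return (path, first_four_bytes) for each file under root that is not valid UTF-8.

    Hypothetical helper for diagnosis only; it reads every file fully into memory,
    so it is meant for a source tree, not for huge data directories.
    """
    bad = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        try:
            data.decode("utf-8")
        except UnicodeDecodeError:
            # Keep the leading bytes so the file type can be guessed,
            # e.g. b"\x89PNG" marks a PNG image.
            bad.append((path, data[:4]))
    return bad
```

Calling `find_non_utf8_files("path/to/ingested/code")` (the path is a placeholder) lists the files that would raise the same `UnicodeDecodeError`. Those could then be moved out of the ingested directory or excluded in whatever way the loader is configured.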
