[FEATURE] all uncompressed file sizes for different scale factors #67

@MrPowers

I computed the uncompressed size of each table for scale factor 1 by loading it into pandas and measuring the DataFrame's memory usage:

import pandas as pd

def get_mem(df):
    # Get memory usage in bytes
    memory_usage = df.memory_usage(deep=True).sum()
    
    print(f"Memory size: {memory_usage:,} bytes")
    print(f"Memory size: {memory_usage / (1024**2):.2f} MB")

df = pd.read_parquet('building.parquet')
get_mem(df)

Here are the resulting sizes for SF1 (pandas in-memory usage, not the on-disk Parquet file size):

| Table    | Size (SF1) |
|----------|------------|
| building | 4.06 MB    |
| customer | 9.30 MB    |
| driver   | 0.15 MB    |
| trip     | 3547.67 MB |
| vehicle  | 0.02 MB    |
| zone     | 1388.61 MB |

Perhaps we can expose a helper to fetch the uncompressed data size and add this info to the documentation?
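
As a rough sketch of what such a helper could look like: the function names below (`memory_bytes`, `to_mb`, `size_report`) are hypothetical and not part of any existing API in this project. It assumes the same measurement as above, i.e. the deep in-memory size of a pandas DataFrame, which may differ from other definitions of "uncompressed size".

```python
from __future__ import annotations


def memory_bytes(df) -> int:
    """Return the deep in-memory size of a pandas DataFrame, in bytes.

    Hypothetical helper: relies only on `df.memory_usage(deep=True)`,
    the same call used in the snippet above.
    """
    return int(df.memory_usage(deep=True).sum())


def to_mb(num_bytes: int) -> float:
    """Convert bytes to MB using 1 MB = 1024**2 bytes, as in the issue."""
    return num_bytes / (1024 ** 2)


def size_report(sizes: dict[str, int]) -> str:
    """Format a {table_name: size_in_bytes} mapping, one table per line."""
    return "\n".join(
        f"{name}: {to_mb(num_bytes):.2f} MB" for name, num_bytes in sizes.items()
    )
```

With pandas installed, usage could be along the lines of `size_report({"building": memory_bytes(pd.read_parquet("building.parquet"))})`, repeated for each table and scale factor. Returning the byte count rather than printing it keeps the helper reusable for generating documentation tables.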

Labels: enhancement
