jgold21 commented Dec 16, 2022

When a single column consistently holds a large number of values, `array.push(...values)` throws `RangeError: Maximum call stack size exceeded`, because every spread element is passed as a separate function argument. `Array.prototype.push.apply` fails the same way. This change fixes that and also increases the speed of the reader.
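The failure mode can be sketched as follows. This is a minimal illustration, not the code from this pull request: `appendAll` and `spreadPushOverflows` are hypothetical helper names, and the array size at which the spread form overflows depends on the JavaScript engine and its stack limit.

```javascript
// Hypothetical helper (not from the PR): append every element of `src`
// to `dest` one at a time, so the number of function arguments stays
// constant no matter how large `src` is.
function appendAll(dest, src) {
  for (let i = 0; i < src.length; i++) {
    dest.push(src[i]);
  }
  return dest;
}

// Hypothetical probe: reports whether spreading `src` into push()
// raises a RangeError on the current engine.
function spreadPushOverflows(src) {
  try {
    [].push(...src); // each element becomes one call argument
    return false;
  } catch (err) {
    if (err instanceof RangeError) return true;
    throw err;
  }
}

// A large column-sized array: the loop-based append always works,
// while the spread-based push may or may not overflow here.
const values = new Array(500000).fill(7);
const combined = appendAll([1, 2], values);
console.log(combined.length);
console.log(spreadPushOverflows(values));
```

The loop trades one large variadic call for many small ones, which keeps the argument count independent of the column size.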

jim-lake pushed a commit to jim-lake/parquetjs that referenced this pull request Oct 15, 2025
Note: The reference Brotli file doesn't work because it is too large: it
decompresses to more than 1 GB.

Closes ironSource#140
Closes ironSource#125 (Likely?)

A new Brotli sample file was generated using the Python script below.

## What Changed?

- Moved the esbuild work to mjs. This was required to get some of the
builds working correctly
- Fixed a bug in the esbuild script where the browser code was built in
parallel with the test code, causing a race condition
- Split the compression.ts file into a browser version and a node
version
- Swapped over to use `esbuild-plugin-wat` which worked better than the
copy-pasted one from esbuild
- Integrated the brotli-wasm correctly for browser, but used brotli
natively in nodejs
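As a rough illustration of what native Brotli on the Node.js side can look like, the sketch below round-trips a string through Node's built-in `zlib` Brotli functions. It is an assumption that the node version of the compression module relies on `zlib`; `brotliRoundTrip` is a hypothetical name used only for this example.

```javascript
// Node.js ships Brotli support in its built-in zlib module.
const zlib = require('node:zlib');

// Hypothetical helper (not from the PR): compress a string with Brotli,
// decompress it again, and report the compressed size.
function brotliRoundTrip(text) {
  const input = Buffer.from(text, 'utf8');
  const compressed = zlib.brotliCompressSync(input);
  const restored = zlib.brotliDecompressSync(compressed);
  return {
    compressedBytes: compressed.length,
    text: restored.toString('utf8'),
  };
}

const result = brotliRoundTrip('parquet '.repeat(100));
console.log(result.compressedBytes, result.text.length);
```

A browser build has no `zlib` module, which is one reason a separate WebAssembly-based path such as brotli-wasm is needed there.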

## Testing!

- There is a nodejs test for the node version
- There is a browser test for it as well:
    1. `npm i`
    2. `npm run build:browser`
    3. `npx serve .`
    4. `open http://localhost:3000/test/browser/` in your preferred browser
- The example server also has it:
    - `cd examples/server/`
    - `npm i`
    - `node app.js`
    - `cd ../../ && npm run serve`
    - `open http://localhost:3000/` in your preferred browser

### Brotli Sample File generation script

```python
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

# Create a small sample DataFrame
data = {
    'id': [1, 2, 3, 4, 5],
    'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'age': [25, 30, 35, 40, 45]
}

df = pd.DataFrame(data)

# Convert DataFrame to PyArrow Table
table = pa.Table.from_pandas(df)

# Define output Parquet file path
output_file = "sample_brotli_compressed.parquet"

# Write to Parquet file with Brotli compression
pq.write_table(table, output_file, compression='BROTLI')

print(f"File {output_file} created successfully!")
```