feat: add xlsx and ods general fallback support #241
t1anchen wants to merge 1 commit into phiresky:master
Conversation
@t1anchen I opened #247 as a proposal for a long-term solution to this problem of the built-in adapters requiring code changes for any new extension or mimetype. Your review would be appreciated if you have the time! I'm new to Rust, so any feedback would be helpful regardless of how large or small.
The reason your initial solution does not work is that when you run unzip you don't filter out binary files, so you would probably need a script that unzips a file to stdout while skipping all binary files. That's what the integrated zip adapter does. I agree with @lafrenierejm that the solution to this is probably to make the extensions, especially zip, user-configurable.
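A rough sketch of such a script, in Python rather than as the integrated adapter's actual code: it walks a zip archive, skips members that look binary, and writes the rest as text. The function names and the NUL-byte heuristic are assumptions made for illustration, not how the built-in zip adapter decides what is binary.

```python
import io
import sys
import zipfile


def looks_binary(data: bytes, sample_size: int = 8192) -> bool:
    """Heuristic (an assumption): a NUL byte in the first bytes means binary."""
    return b"\x00" in data[:sample_size]


def zip_text_members(source):
    """Yield (member_name, text) for each non-binary member of a zip archive."""
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            data = archive.read(info)
            if looks_binary(data):
                continue  # skip thumbnails, images, embedded objects, ...
            yield info.filename, data.decode("utf-8", errors="replace")


def dump_zip_text(source, out=sys.stdout) -> None:
    """Write every text member to `out`, prefixing each line with its member name."""
    for name, text in zip_text_members(source):
        for line in text.splitlines():
            out.write(f"{name}: {line}\n")
```

Since a custom adapter receives the input file on stdin, a wrapper would likely need to buffer it first, for example `dump_zip_text(io.BytesIO(sys.stdin.buffer.read()))`, because `zipfile` requires a seekable stream.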
I believe you're a master at building things, and I'm just an ordinary user :)
Since a PR (#247) has been created as a permanent solution, this PR can be closed.
Background
I attempted to use a custom adapter, as suggested in #36 (comment), and configured it like this:
    {
      "name": "spreadsheet",
      "version": 1,
      "description": "Spreadsheet like ods is supported in the search",
      "extensions": ["ods"],
      "mimetypes": ["application/vnd.oasis.opendocument.spreadsheet"],
      "binary": "unzip",
      // Placeholders:
      //
      // - $input_file_extension: the file extension (without dot). e.g.
      //   foo.tar.gz -> gz
      // - $input_file_stem, the file name without the last extension. e.g.
      //   foo.tar.gz -> foo.tar
      // - $input_virtual_path: the full input file path. Note that this path
      //   may not actually exist on disk because it is the result of another
      //   adapter
      //
      // stdin of the program will be connected to the input file, and
      // stdout is assumed to be the converted file
      "args": ["-U", "-aa", "-p", "$input_virtual_path", "content.xml"],
      "disabled_by_default": false,
      "match_only_by_mime": false
    }

However, the search did not show the match and always reported an unknown error, even though running the same command line (binary and args from the config) in a terminal succeeds without any issue.

From reading the code, I found this line:
ripgrep-all/src/adapters/zip.rs
Line 10 in e207c12
ods and xlsx files are zip archives, so they can be searched after extraction.
Purpose
This just provides a general fallback for searching zipped documents that are not supported by pandoc or other tools. If a custom adapter is configured, it is used rather than this built-in fallback.
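To illustrate why treating these documents as zip archives is enough to make them searchable, here is a minimal Python sketch. It assumes the interesting text lives in the content.xml member, the same member the custom adapter config in the Background passes to unzip. The function names are hypothetical, and this is not how ripgrep-all implements the search.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET


def zipped_xml_text(source, member="content.xml"):
    """Return the non-empty text fragments of one XML member of a zip archive."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read(member))
    # itertext() walks the element tree in document order, yielding text only.
    return [t.strip() for t in root.itertext() if t.strip()]


def search_zipped_xml(source, pattern, member="content.xml"):
    """Return the text fragments of `member` that match the regex `pattern`."""
    regex = re.compile(pattern)
    return [frag for frag in zipped_xml_text(source, member) if regex.search(frag)]
```

Parsing the XML rather than searching the raw bytes keeps markup out of the matches, at the cost of losing cell positions. Whether that trade-off suits a real adapter is an open question.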
.epub can also be considered a zipped package if we follow this idea.
Further
Basically this is just an ad hoc tweak, but hopefully it provides a temporary mitigation for users' pain. If there is a better permanent or ad hoc solution, I'm glad to help with it.