Description
Hello,
I have no programming background, but I greatly enjoyed using the Chat Export tool.
I encountered some issues when running the exported HTML file with media attachments. I exported a conversation with over 17,000 messages, but starting around message #3000, the audio and video media stopped playing—the media players were inactive.
Another issue I noticed was with localization: message dates were displayed in English (e.g., 'Thu') instead of Brazilian Portuguese, which is my language.
I attempted to implement new updates to the code using Artificial Intelligence, but as I lack programming knowledge, I failed and couldn't complete the task.
Therefore, I would like to offer a few suggestions for improvements. I hope this doesn't come across as being demanding (or "a pain").
Thank you in advance for your dedication to the code. My WhatsApp: 5599981338301
🐛 Identified Issues
- Media Playback Failure: Audio and video media fail to play (become inactive/non-functional) in the generated HTML file starting around message #3000 in very large conversations (e.g., 17,000+ messages).
- Localization Inconsistencies: Date information (e.g., "Thu" instead of "Qui") and system messages (e.g., call information like "[call (attempt)]") are displayed in English instead of the user's expected language (Brazilian Portuguese, in this case).
💡 Improvement Suggestions
1. File and Directory Management (Output Structure)
Objective: Standardize the output directory for better organization and predictability.
- Output Directory Location: The generated output (HTML file, media, CSS, and attachments) must be saved into a new directory created in the same parent folder where the source ZIP file is located.
  - Example: If `chat.zip` is in `C:/Users/User/Documents/`, the output directory should be created in `C:/Users/User/Documents/`.
- Directory Naming and Conflict Resolution:
- The output directory should be named after the conversation.
- If a directory with the exact same name already exists, the existing directory should be preserved, and a new, numbered directory must be created automatically.
- Naming Convention for Duplicates: Append a counter: `[DirectoryName] (2)`, `[DirectoryName] (3)`, etc.
- Clarity: Each exported conversation should maintain all its related files (HTML, media, assets) within its single, dedicated output directory.
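To illustrate the conflict-resolution rule above, here is a minimal sketch. The helper name `unique_output_dir` and its parameters are hypothetical and are not part of the existing tool; it only assumes the ZIP path and the conversation name are already known.

```python
from pathlib import Path


def unique_output_dir(zip_path: str, chat_name: str) -> Path:
    """Create and return a fresh output directory next to the ZIP file.

    Hypothetical helper: an existing directory is never reused. A counter
    is appended instead, giving "Name", "Name (2)", "Name (3)", and so on.
    """
    parent = Path(zip_path).resolve().parent
    candidate = parent / chat_name
    counter = 2
    while candidate.exists():
        candidate = parent / f"{chat_name} ({counter})"
        counter += 1
    candidate.mkdir()  # raises if another process created it in the meantime
    return candidate
```

Calling it three times with the same name would leave three sibling directories, so an earlier export is never overwritten.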
2. Detailed Message Statistics Reporting
Objective: Enhance the log/console output with a detailed breakdown of message types.
The script, in the step where it displays information about the ZIP file (e.g., "ZIP file is an Android export with media/attachments..."), must include the following detailed message counts:
- Participant Message Counts: The number of messages sent by each participant.
- System Messages: The count of non-participant messages (e.g., "Messages and calls are end-to-end encrypted...", unparsed system notifications).
- Invalid/Unparseable Messages: The count of messages that cannot be definitively classified as system or participant messages. Display `0` if none are found.
The overall output structure in the console log should be updated to follow this format (in English):
Messages in this chat: [total_count]
[Participant 1 Name]: [count_1]
[Participant 2 Name]: [count_2]
_______________________________________
System messages: [system_count]
Invalid messages: [invalid_count]
Formatting Note: All message counts displayed in the console output (like the total count and the count in the user prompt for splitting, see point 4) should use a dot (.) as the thousands separator.
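A rough sketch of how this summary could be produced. Everything here is an assumption for illustration: the `(sender, kind)` message shape and the names `format_count` and `summarize` are hypothetical, and the total is taken to be the sum of participant messages, which is what the console example in this issue suggests (8.244 + 8.295 = 16.539).

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def format_count(n: int) -> str:
    """Format an integer with a dot as the thousands separator."""
    return f"{n:,}".replace(",", ".")


def summarize(messages: Iterable[Tuple[Optional[str], str]]) -> str:
    """Build the console summary text.

    Each message is a hypothetical (sender, kind) pair, where kind is
    "participant", "system", or "invalid".
    """
    messages = list(messages)
    per_sender = Counter(s for s, kind in messages if kind == "participant")
    system = sum(1 for _, kind in messages if kind == "system")
    invalid = sum(1 for _, kind in messages if kind == "invalid")
    # Assumption: the total counts participant messages only.
    total = sum(per_sender.values())

    lines = [f"Messages in this chat: {format_count(total)}"]
    lines += [f"{name}: {format_count(count)}" for name, count in per_sender.items()]
    lines.append("_" * 39)
    lines.append(f"System messages: {format_count(system)}")
    lines.append(f"Invalid messages: {format_count(invalid)}")
    return "\n".join(lines)
```

Participants are listed in the order they first appear in the chat, since `Counter` keeps insertion order.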
3. Localization and Internationalization (i18n)
Objective: Ensure all dynamic text in the HTML output respects the conversation's detected language.
- Automatic Language Detection: Use the `langdetect` library to automatically identify the predominant language of the chat content (`.txt` file). Do not prompt the user for the language or display the detected language.
- Date/Number Formatting: Use the `Babel` library to adjust the date and number formatting based on the detected language (e.g., `dd/mm/yyyy` for Portuguese, `mm/dd/yyyy` for English).
- Translation of Fixed/System Strings:
  - The date string beneath each message must be translated (e.g., "Thu" $\rightarrow$ "Qui").
  - Call/system information must be translated (e.g., `[call (attempt)]` $\rightarrow$ chamada (tentativa)).
  - Crucial Formatting Rule: Remove the square brackets (`[]`) from translated system messages such as call logs, while keeping the parentheses `()` (e.g., `[call (attempt)]` $\rightarrow$ chamada (tentativa)).
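The points above might look roughly like the following sketch. The translation table `SYSTEM_STRINGS` and all three function names are hypothetical. `langdetect.detect`, `DetectorFactory.seed`, and `babel.dates.format_date` are real library calls, but how the tool would wire them in is an assumption, as is passing the detected code straight to Babel as a locale (some codes may need mapping first).

```python
import re
from datetime import date

# Hypothetical translation table keyed by language code. Only the example
# string from this issue is included.
SYSTEM_STRINGS = {
    "pt": {"call (attempt)": "chamada (tentativa)"},
}

# Matches a message that is entirely wrapped in square brackets.
BRACKETED = re.compile(r"^\[(.+)\]$")


def translate_system_message(text: str, lang: str) -> str:
    """Translate a bracketed system message and drop the square brackets.

    Text that is not bracketed, or has no known translation, is returned
    unchanged.
    """
    match = BRACKETED.match(text.strip())
    if not match:
        return text
    translated = SYSTEM_STRINGS.get(lang, {}).get(match.group(1))
    return translated if translated is not None else text


def detect_language(chat_text: str) -> str:
    """Return a language code such as "pt" for the chat text."""
    # Imported lazily so the rest of the sketch runs without langdetect.
    from langdetect import DetectorFactory, detect

    DetectorFactory.seed = 0  # make detection repeatable across runs
    return detect(chat_text)


def weekday_label(day: date, lang: str) -> str:
    """Return the abbreviated weekday name in the given language."""
    from babel.dates import format_date  # lazy import, as above

    return format_date(day, format="EEE", locale=lang)
```

For example, `translate_system_message("[call (attempt)]", "pt")` returns `chamada (tentativa)`, without the brackets but with the parentheses kept.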
4. Splitting Large Conversations (Chunking)
Objective: Resolve the media playback issue by splitting the exported HTML into smaller, manageable files.
This is a mandatory feature to address the media failure in long chats.
- User Prompt: After the participant selection step, the script must ask the user whether they want to split the conversation.
  - Prompt Text (in English): `Split chat? [Y/n]: Check this option if your conversation contains more than 3.000 messages. In very long conversations, audio and video media may not work. This chat contains [total_count] messages` (Note: `[total_count]` must use the dot (.) as the thousands separator, e.g., 16.539)
- Splitting Logic:
  - If the user chooses 'Yes' (`Y`), the chat must be split into multiple HTML files, with each part containing a maximum of 3,000 messages.
  - The splitting process must happen during the initial export/writing phase (read `.txt`, process, and write to divided HTML files directly), not by attempting to split a single large, already-written HTML file.
- HTML File Naming Convention (for Split Files):

  | Chat Type | Naming Convention |
  | --- | --- |
  | Individual | `Conversa_com_[Name]_Part1.html`, `Conversa_com_[Name]_Part2.html`, etc. |
  | Group | `Conversa_grupo_[GroupName]_Part1.html`, `Conversa_grupo_[GroupName]_Part2.html`, etc. |

- Navigation Links:
  - In the generated HTML for each part, a simple, text-based, centered link must be added at the bottom to navigate to the subsequent part.
  - Link Text (Translated): "Clique aqui para ir para a próxima parte da conversa" (or its equivalent in the detected language).
  - Link Behavior: Clicking the link should automatically open the next HTML part in a new browser tab.
  - Final Part: The link should not appear in the last HTML file.
- Standard Export Link: The original functionality that exports a separate, media-linked HTML file (`chat_media_linked.html`) is not necessary when the chat is split, as the splitting function inherently solves the media issue.
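To make the splitting rules concrete, here is a small sketch of the three pieces: chunking, file naming, and the "next part" link. The function names and the inline HTML are hypothetical. The sketch assumes the messages are already parsed into a list before any HTML is written.

```python
from typing import Iterator, Sequence
from html import escape
from urllib.parse import quote

PART_SIZE = 3000  # maximum messages per HTML part, as requested above
NEXT_LABEL = "Clique aqui para ir para a próxima parte da conversa"


def split_into_parts(messages: Sequence, size: int = PART_SIZE) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` messages."""
    for start in range(0, len(messages), size):
        yield messages[start:start + size]


def part_filename(name: str, is_group: bool, part: int) -> str:
    """Build the file name for a 1-based part number."""
    prefix = "Conversa_grupo" if is_group else "Conversa_com"
    return f"{prefix}_{name}_Part{part}.html"


def next_part_link(name: str, is_group: bool, part: int, total_parts: int,
                   label: str = NEXT_LABEL) -> str:
    """Return a centered link to the next part, or "" for the last part."""
    if part >= total_parts:
        return ""
    # URL-encode the file name so spaces or accents do not break the link.
    href = quote(part_filename(name, is_group, part + 1))
    return (
        '<p style="text-align:center">'
        f'<a href="{href}" target="_blank" rel="noopener">{escape(label)}</a>'
        "</p>"
    )
```

With 16,539 messages this yields six parts: five of 3,000 messages and a final one of 1,539, and only the first five get a link.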
📝 Final Output Flow Example (Script Console)
The final console script execution flow should follow this order and include the requested new information:
Welcome to chat-export v1.0.3
----------------------------------------
Select the WhatsApp chat export ZIP file you want to convert to HTML.
Processing selected file: C:/Projetos/Conversa WhatsApp Neuziany.zip...
ZIP file is an Android export with media/attachments, 'Conversa do WhatsApp com Neuziany das Cunhãs.txt' is the chat text file.
Messages in this chat: 16.539
Neuziany das Cunhãs: 8.244
Arlin: 8.295
_____________________________
System messages: 3
Invalid messages: 0
Optional: Enter date range to filter messages
Supported formats: MM/DD/YYYY, DD.MM.YYYY, MM/DD/YY, DD.MM.YY
Leave empty to skip
From date (optional):
Until date (optional):
Found the following participants in the chat:
1. Neuziany das Cunhãs
2. arlin
Enter the number corresponding to your name: 2
Split chat? [Y/n]:
Check this option if your conversation contains more than 3.000 messages. In very long conversations, audio and video media may not work. This chat contains 16.539 messages
y
Exporting 16.539 messages.
Writing HTML files...
Extracting attachments/media...
Processing took 20.887 seconds
Written: C:\Projetos\chat-export-main\Conversa_com_Neuziany_Parte1.html, C:\Projetos\chat-export-main\Conversa_com_Neuziany_Parte2.html, ...
Done.
Would you like to open them in the browser? [Y/n]: n
Do you like the tool and want to buy me a coffee? [y/N]: y