gnu-emacs-ru/telega-export

telega-export

Overview

telega-export is a small Emacs Lisp module that exports Telegram chat history via telega.el and TDLib.

  • Export formats: JSON and Org
  • Message order: descending (new → old) or ascending (old → new)
  • Optional filtering of service (system) messages
  • Robust, incremental, batched retrieval compatible with TDLib limits
  • Minimal dependencies, easy to extend

It provides a single interactive entry point:

  • M-x telega-export-chat

This command prompts for the chat, the output format and order, whether to include service messages, and the output file path.

Requirements

  • Emacs 27.1+
  • telega.el 0.8.000+
  • json.el 1.5+
  • TDLib via telega-server (provided by telega.el)
  • A logged-in, working telega session

Installation

  1. Place telega-export.el somewhere on your load-path, or into your project:
    • Example: ~/Code/telega-export/telega-export.el
  2. Add to your init:
    • (add-to-list 'load-path "~/Code/telega-export/")
    • (require 'telega-export)

Or with straight.el:

  • (straight-use-package '(telega-export :type git :host nil :repo "file:///path/to/local/telega-export"))

Quick start

  1. Start telega and ensure the target chat is accessible in Emacs.
  2. M-x telega-export-chat
  3. Provide:
    • Chat ID (e.g., -1001234567890)
    • Format: json or org
    • Order: desc or asc
    • Whether to include service messages
    • Output file path

On completion, a message indicates how many messages were exported and where.

Output formats

JSON

  • A single JSON object containing:
    • chat: {id, title}
    • exported_at: ISO-8601 timestamp (local)
    • messages: array of normalized message entries

Each message has:

  • id: integer (TDLib message id)
  • date_unix: integer (seconds since epoch)
  • date_iso: string (local ISO-8601)
  • sender: string (best-effort display name)
  • content_type: string (e.g., “messageText”, “messagePhoto”, …)
  • text: string (message text or media caption; may be empty)
  • reply_to: integer (reply-to message id, or 0)

Note: Attachments are not downloaded; only text/captions are exported.
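As a rough illustration of consuming the JSON export, the sketch below parses a document in the shape described above with json.el and counts messages per sender. The helper name my/telega-export-count-by-sender and the sample data are made up for this example and are not part of telega-export.

```elisp
(require 'json)

(defun my/telega-export-count-by-sender (json-text)
  "Return an alist of (SENDER . COUNT) for the export held in JSON-TEXT.
Hypothetical helper for illustration; not part of telega-export."
  (let* ((json-object-type 'alist)   ; JSON objects become alists
         (json-array-type 'list)     ; JSON arrays become lists
         (json-key-type 'symbol)     ; object keys become symbols
         (export (json-read-from-string json-text))
         (counts nil))
    (dolist (msg (alist-get 'messages export))
      (let* ((sender (alist-get 'sender msg))
             (cell (assoc sender counts)))
        (if cell
            (setcdr cell (1+ (cdr cell)))
          (push (cons sender 1) counts))))
    ;; Senders are listed in the order they were first seen.
    (nreverse counts)))

;; Made-up sample that follows the documented schema.
(defvar my/telega-export-sample
  "{\"chat\": {\"id\": -1001234567890, \"title\": \"Demo\"},
    \"messages\": [
      {\"id\": 3, \"sender\": \"Alice (@alice)\", \"text\": \"Hi\"},
      {\"id\": 2, \"sender\": \"Bob\", \"text\": \"Hello\"},
      {\"id\": 1, \"sender\": \"Alice (@alice)\", \"text\": \"Bye\"}]}")

(my/telega-export-count-by-sender my/telega-export-sample)
;; => (("Alice (@alice)" . 2) ("Bob" . 1))
```

For a real export file, json-read-file can be used in place of json-read-from-string with the same let-bound settings.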

Org

  • File header with chat title and export timestamp
  • One heading per message:
    • Format: “**** 2024-10-05T18:54:12+0300 Sender Name (@username): message text”
    • Heading level is configurable via telega-export-org-heading-level
  • Newlines and asterisks in text are sanitized
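The heading format above could be produced along the following lines. This is only a sketch: the function name my/org-heading is hypothetical, and the sanitization rule (newlines become a space, asterisks become hyphens) is an assumption, since the exact replacement used by the exporter is not specified here.

```elisp
(defun my/org-heading (level date-iso sender text)
  "Format one message as an Org heading at LEVEL.
Hypothetical helper; the sanitization rule is an assumption."
  (let ((clean (replace-regexp-in-string
                "\\*" "-"                       ; assumed: asterisks -> hyphens
                (replace-regexp-in-string
                 "[\r\n]+" " " text))))         ; assumed: newlines -> one space
    (format "%s %s %s: %s"
            (make-string level ?*)              ; LEVEL stars open the heading
            date-iso sender clean)))

(my/org-heading 1 "2024-10-05T18:54:43+0300" "Alice (@alice)" "Hello,\n*world*!")
;; => "* 2024-10-05T18:54:43+0300 Alice (@alice): Hello, -world-!"
```

Keeping the message on a single line without raw asterisks prevents message text from being read as additional Org headings.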

Configuration (M-x customize-group RET telega-export)

  • telega-export-default-format: 'json or 'org (default: json)
  • telega-export-default-order: 'desc or 'asc (default: desc)
  • telega-export-include-service: boolean (default: nil)
  • telega-export-batch-size: integer (default: 200; clamped to TDLib max per request)
  • telega-export-org-heading-level: integer (default: 1)

Notes:

  • TDLib enforces limit ≤ 100 per request; the exporter clamps per-call limit accordingly.
  • Service messages are heuristically detected by TDLib content types.
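The same options can also be set from an init file instead of the customize interface. The values below are arbitrary examples chosen for illustration, not recommendations.

```elisp
(require 'telega-export)

;; Example values only; every variable is listed in the section above.
(setq telega-export-default-format 'org      ; export to Org by default
      telega-export-default-order 'asc       ; oldest -> newest
      telega-export-include-service t        ; keep service messages
      telega-export-batch-size 100           ; matches the TDLib per-request cap
      telega-export-org-heading-level 2)     ; headings start with "**"
```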

Pagination algorithm and robustness

  • Uses TDLib getChatHistory in batches (newest → oldest per batch).
  • Global order:
    • desc: outputs newest batch first, then older batches; net result is newest → oldest overall.
    • asc: reverses each batch before writing; overall result is oldest → newest.
  • The exporter uses boundary pagination safely:
    • from_message_id is advanced to the oldest message boundary returned
    • An offset of -1 is used for subsequent calls to exclude the boundary (or, depending on local version, a clamped from_message_id is used to avoid invalid negatives)
    • from_message_id is never sent negative
  • Per-call limit is clamped to ≤ 100 to satisfy TDLib

This design avoids TDLib 400 “Invalid value of parameter from_message_id” errors that can occur when a negative or otherwise invalid boundary is passed.
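The boundary-advancing loop can be sketched without TDLib at all. In the example below, FETCH-FN is a hypothetical stand-in for a getChatHistory request, and my/fake-history-slice simulates a chat as a newest-first list of message ids; none of these names come from telega-export, and the real exporter may differ in detail.

```elisp
(require 'cl-lib)

(defun my/paginate-history (fetch-fn batch-size)
  "Collect message ids newest-first by calling FETCH-FN repeatedly.
FETCH-FN is called with (FROM-ID LIMIT) and returns ids newest-first.
FROM-ID 0 means: start from the newest message."
  (let ((limit (max 1 (min batch-size 100)))    ; clamp to the per-request cap
        (from-id 0)                             ; never negative
        (result nil)
        (done nil))
    (while (not done)
      ;; Drop the boundary message in case the backend returns it again.
      (let ((batch (cl-remove from-id (funcall fetch-fn from-id limit))))
        (if (null batch)
            (setq done t)                       ; nothing older is left
          (setq result (append result batch)
                from-id (car (last batch))))))  ; advance to the oldest id seen
    result))

(defun my/fake-history-slice (history from-id limit)
  "Simulate one history request against HISTORY, a newest-first id list."
  (let ((older (if (> from-id 0)
                   (cl-remove-if-not (lambda (id) (< id from-id)) history)
                 history)))
    (cl-subseq older 0 (min limit (length older)))))

(my/paginate-history
 (apply-partially #'my/fake-history-slice '(50 40 30 20 10)) 2)
;; => (50 40 30 20 10)
```

In this sketch, calling reverse on the collected list is one simple way to obtain oldest → newest order.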

Known limitations

  • Only text and captions are exported; media files are not downloaded or referenced
  • Sender name is a best-effort resolution (user full name and/or @username; or chat title for sender chats)
  • Service message detection is heuristic and might miss or over-include some types
  • Times are local timezone ISO-8601 strings; exact timezone is embedded (e.g., +0300)
  • No built-in resume for partially written JSON arrays; treat each run as a fresh export
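To illustrate how a Unix timestamp relates to an ISO-8601 string with an embedded offset, the sketch below formats a fixed timestamp at a fixed UTC offset with format-time-string. The helper name my/unix->iso is hypothetical, and the exporter itself uses the local timezone rather than an explicit offset.

```elisp
(defun my/unix->iso (unix-seconds utc-offset-seconds)
  "Format UNIX-SECONDS as ISO-8601 at the given UTC-OFFSET-SECONDS.
Hypothetical helper for illustration."
  (format-time-string "%Y-%m-%dT%H:%M:%S%z" unix-seconds utc-offset-seconds))

(my/unix->iso 1728143683 (* 3 60 60))
;; => "2024-10-05T18:54:43+0300"
```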

Troubleshooting

  • TDLib 400 “Invalid value of parameter from_message_id”:
    • Ensure you are on recent telega.el and TDLib
    • The exporter clamps limits and uses boundary-safe pagination; if you patched locally, keep offset = -1 after the first batch or clamp the boundary to ≥ 0
  • Empty result:
    • Verify Chat ID is correct and accessible
    • Open the chat in telega buffer, then retry (ensures it’s loaded)
  • Performance:
    • Consider running during off-peak times; interactive Emacs may block while exporting
    • Large chats will take time; watch minibuffer progress messages

Extending

  • Adding a new format:
    • Implement writer open/write/close functions and add a branch in telega-export--make-writer
  • Enrich JSON schema:
    • Extend telega-export--msg->json-alist with extra TDLib fields
  • Advanced filtering:
    • Adjust service detection in telega-export--service-content-p or add custom predicates
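As an example of a custom predicate, the sketch below classifies a message by its content type string. The names my/service-content-types and my/service-content-type-p are hypothetical, the list of types is a small illustrative selection, and the actual signature of telega-export--service-content-p is not documented here, so wiring this into the exporter is left open.

```elisp
(defvar my/service-content-types
  '("messageChatAddMembers"
    "messageChatDeleteMember"
    "messageChatJoinByLink"
    "messageChatChangeTitle"
    "messagePinMessage")
  "Content types treated as service messages in this sketch.")

(defun my/service-content-type-p (content-type)
  "Return t if CONTENT-TYPE names a service message, nil otherwise."
  (and (member content-type my/service-content-types) t))

(my/service-content-type-p "messagePinMessage") ; => t
(my/service-content-type-p "messageText")       ; => nil
```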

Example

JSON messages entry (illustrative):

[
  {
    "id": 1234567890123,
    "date_unix": 1728143683,
    "date_iso": "2024-10-05T18:54:43+0300",
    "sender": "Alice (@alice)",
    "content_type": "messageText",
    "text": "Hello, world!",
    "reply_to": 0
  }
]

Org line (at the default heading level 1):

* 2024-10-05T18:54:43+0300 Alice (@alice): Hello, world!

License

GNU 2.1
