Refactor Lifestream integration and improve code structure by aquarion · Pull Request #61 · aquarion/Lifestream

aquarion · 2026-03-04T11:50:26Z

This pull request modernizes the Lifestream project by introducing a new scheduler system, updating documentation, refactoring import scripts, and improving configuration and testing. The changes make the project easier to set up, maintain, and extend, while deprecating legacy approaches and clarifying best practices.

Scheduler & Job Management

Added a new APScheduler-based job scheduler with Redis persistence, replacing the legacy cron-based approach. Includes a systemd service file (docs/lifestream-scheduler.service) and updated documentation for job management and monitoring. [1] [2]
Updated config.example.ini with a [schedules] section for job configuration, notification settings, and removed legacy memcache config. [1] [2]

Documentation & Project Structure

Replaced the old README.textile with a comprehensive new README.md covering features, setup, configuration, running imports, project structure, and testing. [1] [2]

Import Script Refactoring

Refactored import scripts (imports/atom.py, imports/atproto_posts.py, imports/destiny2.py, contrib/oyster_import_csv.py) to use the new EntryStore class for database entry management, improved argument parsing, and clarified logging. [1] [2] [3] [4] [5] [6] [7]

Configuration & Testing

Updated .flake8 to ignore additional formatting rules and exclude new directories, improving code linting.
Added instructions and commands for running tests with pytest in the new documentation.

These changes collectively modernize the project, improve maintainability, and streamline both development and operations.

…ple modules

…mproved connection handling

…in Lifestream

…new helper methods

… functionality and backward compatibility

…ic into foursquare.py; remove obsolete foursquare_oauth.py

…t code_fetcher as CodeFetcher9000' across multiple modules

…le modules

…; update documentation and configuration files accordingly

…implement job execution functions for scheduler

…or caching in FoursquareAPI

…e logging and error handling across modules

…e features, quick start, configuration, and running imports sections

- Consolidated local imports in `steambadges.py`, `switchbot.py`, `tumblr.py`, `tweets.py`, `wordpress.py`, and `wow.py`. - Enhanced readability by adjusting line breaks and spacing in various files. - Updated exception handling to be more specific in `tumblr.py` and `wow.py`. - Added `flake8` as a dependency in `pyproject.toml` for code style enforcement. - Cleaned up unused imports and improved consistency in `scheduler.py`, `tests/conftest.py`, and other test files. - Ensured all changes adhere to PEP 8 style guidelines for better maintainability.

imports/historic.py

        print("Go to the following link in your browser:")
-        print("%s?oauth_token=%s" %
-              (authorize_url, request_token["oauth_token"]))
+        print("%s?oauth_token=%s" % (authorize_url, request_token["oauth_token"]))


Generally, to fix clear-text logging of sensitive data, ensure that secrets (tokens, passwords, API keys, etc.) are never written directly to logs or console output that may be captured. Either avoid logging them completely, or replace them with redacted or partially masked values.

For this specific code, we want to keep the existing interactive flow (the user must visit the authorization URL) but avoid constructing and printing a single string that embeds the token and could be treated as a logged secret. A simple, low-impact change is:

Print the base authorization URL (authorize_url) and the token separately, with clear instructions to the user.

Avoid combining them into one formatted URL string containing the token.

Do not log the token via logger; only show it in the interactive prompt.

Concretely, in imports/historic.py within tumblrAuth, replace the line:

print("%s?oauth_token=%s" % (authorize_url, request_token["oauth_token"]))

with safer messaging like:

print(authorize_url) print("Then provide this oauth_token when prompted in your browser: [REDACTED]") print("oauth_token (copy manually):", request_token["oauth_token"])

or similar, so the token is no longer part of a single URL string and is clearly separated. This keeps functionality (user still sees token) while reducing the pattern that static analysis flags and avoiding printing it as part of a URL that might be copy-pasted or logged wholesale. No new imports or helper methods are required.

imports/lifestream/db.py

imports/tumblr.py

        print("Go to the following link in your browser:")
-        print("%s?oauth_token=%s" %
-              (authorize_url, request_token["oauth_token"]))
+        print("%s?oauth_token=%s" % (authorize_url, request_token["oauth_token"]))


In general, to fix clear-text logging of sensitive information, remove or redact sensitive values before sending them to logging or printing functions. When guidance to the user is still needed, print only non-sensitive parts (e.g., the base URL) and/or instructions, but not the actual secret.

In this file, the primary issue is line 67, which prints the full authorization URL including the oauth_token parameter. We should change this so that the script no longer prints the token value. A simple and safe approach is:

Print only the authorize_url without query parameters.

Optionally instruct the user that if the browser doesn’t auto-add the token, they should re‑run or manually paste as needed (but we will not construct and print a tokenized URL).

Concretely:

In imports/tumblr.py, inside tumblrAuth, replace:

print("%s?oauth_token=%s" % (authorize_url, request_token["oauth_token"]))

with a print that does not include request_token["oauth_token"], e.g.:

print(authorize_url)

No extra imports or helper methods are required. No other lines need to be changed to address the flagged issue, and existing functionality is minimally impacted (the main behavioral change is that the user must manually obtain or follow the Tumblr-provided URL rather than copying the full URL from the console including a token).

tests/test_notifications.py

+
+                    mock_post.assert_called_once()
+                    call_args = mock_post.call_args
+                    assert "hooks.slack.com" in call_args[0][0]


In general, to avoid incomplete URL substring sanitization, always parse the URL and inspect its components (scheme, hostname, path) instead of checking for a bare substring. For host checks, use urllib.parse.urlparse to obtain hostname and compare that against an allowlist or an expected value, handling subdomains explicitly if needed.

In this specific test, we only need to ensure that the URL used for the Slack webhook call is pointing to Slack’s host. Instead of asserting "hooks.slack.com" in call_args[0][0], we can parse the URL and assert that its hostname is "hooks.slack.com". This maintains the same intent—verifying the request is made to Slack—while avoiding substring checks and aligning with the recommended pattern.

Concretely:

In tests/test_notifications.py, import urlparse from urllib.parse.

Replace line 116 with code that parses call_args[0][0] via urlparse and then asserts parsed.hostname == "hooks.slack.com".

No other behavior of the tests or the underlying notifications code needs to change.

the_fallen/openpaths.py

        accuracy = 100
-        logger.info("Found %s %s/%s" %
-                    (timestamp, latitude_best, longitude_best))
+        logger.info("Found %s %s/%s" % (timestamp, latitude_best, longitude_best))


In general, to fix clear‑text logging of sensitive information, either stop logging the sensitive fields entirely, or replace them with non‑sensitive/less‑sensitive data (for example, coarse‑grained or redacted values) while keeping the log’s usefulness. Also ensure that any remaining logging is appropriate for its log level (e.g., detailed data only at DEBUG and ideally still redacted).

For this specific case in the_fallen/openpaths.py, the sensitive fields are the high‑precision latitude and longitude (latitude_best, longitude_best) logged at line 96. The code already computes rounded (“vague”) coordinates (latitude_vague, longitude_vague). The least intrusive fix that preserves behavior is to change the log message to use the vague values instead of the precise ones, and to make the log text explicitly indicate that they’re approximate. No other behavior (DB writes, control flow) needs to change. This only requires editing the logger.info call around line 96; no new imports or helpers are needed.

Concretely: in the_fallen/openpaths.py, replace the logger.info("Found %s %s/%s" % (timestamp, latitude_best, longitude_best)) line with a version that logs latitude_vague and longitude_vague (e.g., "Found %s approx %s/%s"), leaving the rest of the loop unchanged.

Copilot

Pull request overview

This PR modernizes Lifestream by introducing an APScheduler-based scheduler (with Redis persistence), refactoring import scripts around a new shared library surface (EntryStore, cache/backoff helpers, notifications), and updating docs/config/tests to match the new operational model.

Changes:

Added a new scheduler.py daemon using APScheduler + Redis job store, plus systemd and cron migration docs.
Refactored many import scripts to use lifestream.db.EntryStore, new cache/backoff helpers, and updated argument parsing via lifestream.parse_args().
Added a pytest suite for the new cache/db/jobs/notifications modules and refreshed project docs/config examples.

Reviewed changes

Copilot reviewed 56 out of 64 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
vs.code-workspace	Minor workspace formatting fix.
the_fallen/tsw.py	Switches to `EntryStore` + updated arg parsing.
the_fallen/openpaths.py	Updates DB connection helpers and arg parsing.
the_fallen/moves.py	Refactors to `EntryStore`, updated imports/paths for secrets.
tests/test_notifications.py	Adds unit tests for email/Slack notification helpers.
tests/test_jobs.py	Adds unit tests for scheduler job execution helpers.
tests/test_db.py	Adds unit tests for DB connection helpers and `EntryStore`.
tests/test_cache.py	Adds unit tests for Redis backoff + file cache decorator.
tests/conftest.py	Adds shared pytest fixtures and sets up test environment.
scheduler.py	New APScheduler-based scheduler daemon and CLI.
pytest.ini	Updates pytest discovery paths and comment.
pyproject.toml	Adds APScheduler + flake8, removes legacy memcache dependency.
imports/wow.py	Refactors to `EntryStore`, updates secrets dir handling, exception narrowing TODOs.
imports/wordpress.py	Refactors to `EntryStore` and new `lifestream.parse_args()`.
imports/tweets.py	Refactors to `EntryStore` and secrets dir helper.
imports/tumblr.py	Refactors to `EntryStore`, improved exception handling TODOs.
imports/switchbot.py	Refactors stats writes to `EntryStore`, improves typing/default args.
imports/steambadges.py	Refactors to `EntryStore`, minor robustness for missing image src.
imports/steam.py	Extracts Steam API client, refactors to `EntryStore`.
imports/planetside.py	Refactors to `EntryStore` + Redis backoff helper.
imports/oyster_csv.py	Refactors to `EntryStore` + updated arg parsing.
imports/oyster.py	Refactors to `EntryStore`, regex literal cleanup, updated arg parsing.
imports/mastodon_toots.py	Refactors to `EntryStore` + updated arg parsing.
imports/mailbox-stats.py	Refactors stats writes to `EntryStore`, removes old helper usage.
imports/lifestreamutils.py	Removes legacy DB stats helper (replaced by `EntryStore.add_stat`).
imports/lifestream/utils.py	Introduces utility module (time delta formatting, JSON coercion, etc.).
imports/lifestream/steam_api.py	Introduces reusable Steam Web API client.
imports/lifestream/notifications.py	Introduces email/Slack failure notifications for jobs.
imports/lifestream/jobs.py	Introduces `run_import` / `run_shell_command` job runners.
imports/lifestream/foursquare_api.py	Introduces Foursquare client with Redis caching (new module).
imports/lifestream/db.py	Introduces DB helpers and `EntryStore` abstraction.
imports/lifestream/code_fetcher.py	Updates CodeFetcher to use package config/project root and path-safe template loading.
imports/lifestream/cache.py	Introduces Redis helpers + file cache decorator.
imports/lifestream/init.py	Major refactor: config/logging/arg parsing, re-exports, removes legacy classes.
imports/lastfm.py	Converts to `main()` entrypoint and `EntryStore`.
imports/instagram.py	Refactors to `EntryStore` + updated arg parsing.
imports/historic.py	Refactors to `EntryStore` and DB helper functions.
imports/gw2.py	Refactors to `EntryStore`, Redis backoff + file cache decorator.
imports/github_commits.py	Refactors to `EntryStore` + updated arg parsing.
imports/foursquare_oauth.py	Removes legacy OAuth helper script (replaced by `imports/foursquare.py` auth flow).
imports/foursquare.py	Adds integrated OAuth flow + refactors to `EntryStore` and DB helpers.
imports/flickr.py	Converts to `main()` entrypoint, refactors to `EntryStore` and DB helpers.
imports/fitbit_day.py	Refactors to `EntryStore` for entries/stats, removes old stats helper usage.
imports/ffxiv.py	Refactors to `EntryStore` and updated arg parsing.
imports/facebook_posts.py	Refactors to `EntryStore`, Redis backoff, updated secrets dir.
imports/facebook_page.py	Refactors to `EntryStore`, updated secrets dir, small ordering cleanup.
imports/destiny2.py	Refactors to `EntryStore`, Redis backoff helpers, updated secrets dir.
imports/atproto_posts.py	Refactors to `EntryStore` + updated arg parsing.
imports/atom.py	Converts to `main()` entrypoint and refactors to `EntryStore`.
docs/lifestream-scheduler.service	Adds systemd unit for running the scheduler daemon.
docs/crontab.example	Adds deprecated legacy crontab reference and migration notes.
datafiles/achievements_schema.sql	Minor formatting/whitespace fix.
contrib/oyster_import_csv.py	Updates DB connection usage to new helper functions.
contrib/ffxiv-upload-achievement-icons/update_sc_and_update_icons.ps1	Minor formatting/whitespace fix.
contrib/ffxiv-upload-achievement-icons/readme.md	Minor formatting/whitespace fix.
contrib/ffxiv-upload-achievement-icons/lib/powershell/read_ini.ps1	Formatting cleanup.
contrib/ffxiv-upload-achievement-icons/etc/achievements_schema.sql	Minor formatting/whitespace fix.
contrib/ffxiv-upload-achievement-icons/.gitignore	Minor formatting/whitespace fix.
config.example.ini	Adds scheduler schedules + notifications config, removes legacy memcache config, updates paths.
README.textile	Removes legacy README.
README.md	Adds modern README with setup, scheduler usage, configuration and testing instructions.
.gitignore	Removes trailing blank line.
.flake8	Expands ignores/excludes to reflect refactor and new directories.
.envrc	Removes trailing blank line.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-04T12:01:16Z

scheduler.py

+    for job_name, config in schedules.items():
+        cron = config["cron"]
+
+        try:
+            trigger = CronTrigger.from_crontab(cron)
+        except ValueError as e:
+            logger.error(f"Invalid cron expression for {job_name}: {cron} - {e}")
+            continue
+
+        # Check if it's a shell command (starts with !)
+        if job_name.startswith("!"):
+            actual_name = job_name[1:]
+            command = cron  # For shell commands, cron is actually the command
+            scheduler.add_job(
+                run_shell_command,
+                trigger=trigger,
+                args=[actual_name, command],
+                id=actual_name,
+                name=actual_name,
+                misfire_grace_time=config["misfire_grace_time"],
+                coalesce=config["coalesce"],
+                replace_existing=True,
+            )


Shell-command jobs (those with names starting with !) can't work as implemented: cron is used to build CronTrigger.from_crontab(cron) before the job_name.startswith('!') check, but for shell jobs the code then treats that same cron string as the shell command. A non-crontab shell command will fail trigger parsing, and a real cron expression can't also be the command. Consider changing the config format for shell jobs (e.g., separate schedule/command fields) and only pass the cron part into CronTrigger.from_crontab().

imports/lifestream/notifications.py

tests/conftest.py

imports/lifestream/__init__.py

imports/lifestream/db.py

Copilot · 2026-03-04T12:01:18Z

imports/gw2.py

+entry_store = EntryStore()

-APIKEY = Lifestream.config.get("guildwars2", "apikey")
+APIKEY = entry_store.config.get("guildwars2", "apikey")

 logger = logging.getLogger("GW2")
-args = lifestream.arguments.parse_args()
+args = lifestream.parse_args()



Several scripts call entry_store.config.get(...), but EntryStore currently doesn't expose a config attribute. This will raise AttributeError at import-time for this module. Either switch these reads back to lifestream.config, or add a config attribute/property on EntryStore that references the shared lifestream config.

imports/lifestream/foursquare_api.py

…ror notification formatting

aquarion added 19 commits March 2, 2026 19:28

Refactor SteamAPI integration and streamline method calls in steam.py

00f5b6c

Refactor argument parsing to use lifestream.parse_args() across multi…

a5efc37

…ple modules

Refactor secrets and log directory retrieval to use new helper methods

ea5b687

Add no-db mode support to Lifestream class for non-database operations

055eb34

Refactor database functionality and streamline Lifestream class for i…

9eadee1

…mproved connection handling

Refactor Redis connection handling and deprecate old caching methods …

4d25228

…in Lifestream

Refactor database connection handling across multiple modules to use …

dc57878

…new helper methods

Refactor entry handling to use EntryStore across multiple import modules

a18c152

Refactor utility functions and add Foursquare API client for improved…

b365031

… functionality and backward compatibility

Refactor Foursquare OAuth handling and consolidate authentication log…

3e00160

…ic into foursquare.py; remove obsolete foursquare_oauth.py

Refactor import statements to consistently use 'from lifestream impor…

7b15030

…t code_fetcher as CodeFetcher9000' across multiple modules

Refactor code for improved readability and type hinting across multip…

b5c959a

…le modules

Add crontab example for Lifestream scheduled tasks

6640f10

Implement APScheduler for job scheduling and replace crontab approach…

92cff07

…; update documentation and configuration files accordingly

Add notification functionality for job failures via email and Slack; …

840c2c1

…implement job execution functions for scheduler

Remove memcache support and related configurations; switch to Redis f…

7bbece2

…or caching in FoursquareAPI

Refactor importers and add main function for better structure; enhanc…

6c27e1e

…e logging and error handling across modules

Refactor README documentation to improve clarity and structure; updat…

946fb8d

…e features, quick start, configuration, and running imports sections

aquarion requested a review from Copilot March 4, 2026 11:50

Copilot started reviewing on behalf of aquarion March 4, 2026 11:51 View session

github-advanced-security bot found potential problems Mar 4, 2026

View reviewed changes

Copilot AI reviewed Mar 4, 2026

View reviewed changes

Refactor path resolution and database connection handling; improve er…

29646d2

…ror notification formatting

@@ -51,7 +51,9 @@
                     request_token = dict(urllib.parse.parse_qsl(content))
                     print("Go to the following link in your browser:")
-                    print("%s?oauth_token=%s" % (authorize_url, request_token["oauth_token"]))
+                    print(authorize_url)
+                    print("When prompted in your browser, enter the following oauth_token manually:")
+                    print("oauth_token:", request_token["oauth_token"])
                     print()
                     accepted = "n"

@@ -1,6 +1,7 @@
             """Tests for lifestream.notifications module."""
             from unittest.mock import MagicMock, patch
+            from urllib.parse import urlparse
             from lifestream import notifications
@@ -113,7 +114,8 @@
                                 mock_post.assert_called_once()
                                 call_args = mock_post.call_args
-                                assert "hooks.slack.com" in call_args[0][0]
+                                parsed_url = urlparse(call_args[0][0])
+                                assert parsed_url.hostname == "hooks.slack.com"
             class TestSendFailureNotifications:

@@ -93,7 +93,7 @@
                 ):
                     accuracy = 100
-                    logger.info("Found %s %s/%s" % (timestamp, latitude_best, longitude_best))
+                    logger.info("Found %s approx %s/%s" % (timestamp, latitude_vague, longitude_vague))
                     #                                (`id`,                 `lat`,             `long`,            `lat_vague`,   `long_vague`, `timestamp`, `accuracy`)
                     cursor.execute(
                         s_sql,

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor Lifestream integration and improve code structure#61

Refactor Lifestream integration and improve code structure#61
aquarion wants to merge 20 commits intomainfrom
feature/refactor_steam

aquarion commented Mar 4, 2026

Uh oh!

Check failure

Copilot Autofix

Uh oh!

Check failure

Copilot Autofix

Check failure

Copilot Autofix

Check failure

Copilot Autofix

Copilot AI left a comment

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

aquarion commented Mar 4, 2026

Uh oh!

Check failure

Uh oh!

Copilot Autofix

Uh oh!

Check failure

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Uh oh!

Copilot Autofix

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants