[ROMM-2899] PKGJ feed endpoints#2986

Draft
gantoine wants to merge 4 commits into master from romm-2899

Conversation

@gantoine
Member

@gantoine gantoine commented Feb 4, 2026

Description
Explain the changes or enhancements you are proposing with this pull request.

Checklist
Please check all that apply.

  • I've tested the changes locally
  • I've updated relevant comments
  • I've assigned reviewers for this PR
  • I've added unit tests that cover the changes

Screenshots (if applicable)

@gemini-code-assist
Contributor

Summary of Changes

Hello @gantoine, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request expands the API's feed generation capabilities by adding support for PKGJ feeds across multiple PlayStation platforms (PSP, PSVita, and PSX). It streamlines the process of creating these text-based feeds by introducing a reusable response helper and new data schemas, ensuring that users can access game and DLC information in a format compatible with PKGJ clients.

Highlights

  • New PKGJ Feed Endpoints: Introduced several new API endpoints to generate PKGJ-compatible feed files for PlayStation Portable (PSP), PlayStation Vita (PSVita), and PlayStation (PSX) games and DLCs.
  • Code Refactoring for Text Responses: A new helper function, _text_response, was created to centralize and standardize the generation of plain text file responses, improving code reusability and maintainability across various feed endpoints.
  • New Pydantic Schemas: Defined new Pydantic schemas (PkgjBaseItemSchema, PkgjPSPGamesItemSchema, PkgjPSPDlcsItemSchema, PkgjPSVGamesItemSchema, PkgjPSVDlcsItemSchema, PkgjPSXGamesItemSchema) to accurately model the data structure for the new PKGJ feed items.
  • Comprehensive Unit Tests: Added dedicated unit tests for each new PKGJ feed endpoint to ensure their correct functionality and adherence to the expected output format.

Changelog

  • backend/endpoints/feeds.py
    • Imported new Pkgj feed item schemas for PSP, PSVita, and PSX.
    • Created a _text_response helper function to encapsulate text file response generation.
    • Refactored existing pkgi_ps3_feed, pkgi_psvita_feed, and pkgi_psp_feed functions to utilize the new _text_response helper.
    • Added new API endpoints for /pkgj/psp/games, /pkgj/psp/dlc, /pkgj/psvita/games, /pkgj/psvita/dlc, and /pkgj/psx/games.
    • Implemented a _format_pkgj_datetime helper for consistent date formatting in Pkgj feeds.
  • backend/endpoints/responses/feeds.py
    • Imported the datetime module.
    • Defined PkgjBaseItemSchema as a base for Pkgj feed items.
    • Created specific Pydantic schemas: PkgjPSPGamesItemSchema, PkgjPSPDlcsItemSchema, PkgjPSVGamesItemSchema, PkgjPSVDlcsItemSchema, and PkgjPSXGamesItemSchema.
  • backend/tests/endpoints/feeds.py
    • Added test_pkgj_psp_games_feed to verify the PSP games feed endpoint.
    • Added test_pkgj_psp_dlc_feed to verify the PSP DLC feed endpoint.
    • Added test_pkgj_psvita_games_feed to verify the PSVita games feed endpoint.
    • Added test_pkgj_psvita_dlc_feed to verify the PSVita DLC feed endpoint.
    • Added test_pkgj_psx_games_feed to verify the PSX games feed endpoint.
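
The changelog mentions a _format_pkgj_datetime helper without showing its body. A minimal sketch of what such a helper might look like follows. The public name format_pkgj_datetime, the "YYYY-MM-DD HH:MM:SS" output format, the UTC normalization, and the empty string for a missing value are all assumptions made for illustration; none of them are confirmed by the PR.

```python
from datetime import datetime, timezone
from typing import Optional


def format_pkgj_datetime(value: Optional[datetime]) -> str:
    """Hypothetical stand-in for the PR's _format_pkgj_datetime helper."""
    if value is None:
        # Assumption: a missing timestamp becomes an empty column.
        return ""
    if value.tzinfo is not None:
        # Assumption: aware datetimes are normalized to UTC before formatting.
        value = value.astimezone(timezone.utc)
    # Assumption: the feed expects "YYYY-MM-DD HH:MM:SS".
    return value.strftime("%Y-%m-%d %H:%M:%S")


formatted = format_pkgj_datetime(datetime(2026, 2, 4, 12, 30, 0))
```

Keeping the formatting in one function means every PKGJ endpoint emits the same date shape, which appears to be the intent of the helper described above.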

Activity

  • The pull request was opened by gantoine with the title '[ROMM-2899] PKGJ feed endpoints'.
  • The description still contains only the template placeholder text: a prompt to explain the changes, a checklist with every item unchecked, and an empty screenshots section.
  • No comments or reviews have been made on this pull request yet.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces new PKGJ feed endpoints and refactors existing feed generation logic. However, a critical vulnerability exists across all new and refactored endpoints: they are susceptible to CSV/TSV injection due to unsanitized delimiters and newline characters in the generated responses. This could allow an attacker to hijack downloads by manipulating column data. Additionally, the new PKGJ endpoints for games and DLCs suffer from significant code duplication and a bug where content is not filtered by type, leading to incorrect feed content. The current tests are insufficient to catch these filtering issues.

Comment on lines +672 to +688
txt_lines.append(
    "\t".join(
        [
            pkgj_item.title_id,
            pkgj_item.region,
            pkgj_item.type,
            pkgj_item.name,
            pkgj_item.download_link,
            pkgj_item.content_id,
            last_modified,
            pkgj_item.rap,
            pkgj_item.download_rap_file,
            str(pkgj_item.file_size),
            pkgj_item.sha_256,
        ]
    )
)

security-high high

The Pkgj feeds are vulnerable to TSV injection. An attacker can inject a tab character into the ROM name to shift the columns, potentially hijacking the "PKG direct link" column and pointing it to a malicious URL. All fields joined with tabs should have tab and newline characters removed or replaced. This specific issue highlights the need for robust sanitization within the pkgj_*_feed functions, which currently suffer from significant code duplication, making consistent application of such fixes challenging.

Suggested change
txt_lines.append(
    "\t".join(
        [
            pkgj_item.title_id,
            pkgj_item.region,
            pkgj_item.type,
            pkgj_item.name.replace("\t", " ").replace("\n", " ").replace("\r", " "),
            pkgj_item.download_link,
            pkgj_item.content_id,
            last_modified,
            pkgj_item.rap,
            pkgj_item.download_rap_file,
            str(pkgj_item.file_size),
            pkgj_item.sha_256,
        ]
    )
)
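
Because the comment above asks for every tab-joined field to be sanitized, not only the name, one option is to move the cleanup into a single helper that all TSV feeds share. The sketch below is illustrative only: sanitize_tsv_field and tsv_row are hypothetical names that do not exist in the PR, and replacing control characters with spaces is just one possible policy.

```python
from typing import Iterable

# Map the characters that can break a TSV row to plain spaces.
_TSV_UNSAFE = str.maketrans({"\t": " ", "\n": " ", "\r": " "})


def sanitize_tsv_field(value: object) -> str:
    """Hypothetical helper: make one value safe to embed in a TSV row."""
    if value is None:
        return ""
    return str(value).translate(_TSV_UNSAFE)


def tsv_row(fields: Iterable[object]) -> str:
    """Hypothetical helper: sanitize every field, then join with tabs."""
    return "\t".join(sanitize_tsv_field(field) for field in fields)


# A crafted name that tries to push a second URL into the link column.
malicious_name = "Some Game\thttps://evil.example/payload.pkg"
row = tsv_row(["", "US", malicious_name, "https://romm.example/download/1"])
columns = row.split("\t")
```

With the helper in place the crafted name stays inside its own column, so the row still has four columns and the last one is still the legitimate download link.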

Comment on lines +337 to +350
txt_lines.append(
    ",".join(
        [
            pkgi_item.contentid,
            str(pkgi_item.type),
            f'"{pkgi_item.name}"',
            pkgi_item.description,
            pkgi_item.rap,
            f'"{pkgi_item.url}"',
            str(pkgi_item.size),
            pkgi_item.checksum,
        ]
    )
)

security-high high

The feed endpoints construct text-based responses (CSV or TSV) by joining fields with commas or tabs without sanitizing or escaping these delimiters or newline characters within the fields (e.g., ROM name). An attacker with permissions to edit ROM metadata or upload files with crafted names can inject arbitrary columns or rows into the feeds. For Pkgj feeds, this can be used to hijack downloads by shifting the "PKG direct link" column to a malicious URL.

To remediate this, sanitize or escape delimiters (tabs, commas) and newline characters in all fields before joining them. For TSV formats, replace tabs and newlines with spaces. For CSV formats, ensure proper escaping of quotes and delimiters.

Suggested change
txt_lines.append(
    ",".join(
        [
            pkgi_item.contentid,
            str(pkgi_item.type),
            # Double embedded quotes. Built by concatenation because backslashes
            # inside f-string expressions are a SyntaxError before Python 3.12.
            '"' + pkgi_item.name.replace('"', '""') + '"',
            pkgi_item.description.replace('"', '""'),
            pkgi_item.rap.replace('"', '""'),
            f'"{pkgi_item.url}"',
            str(pkgi_item.size),
            pkgi_item.checksum,
        ]
    )
)
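
As an alternative to escaping quotes by hand, the standard library csv module can take care of quoting for the CSV-style feeds. The sketch below assumes that the PKGi client accepts conventional CSV quoting (fields quoted only when needed, embedded quotes doubled), which is not confirmed anywhere in this PR; csv_row is a hypothetical helper name.

```python
import csv
import io
from typing import Iterable


def csv_row(fields: Iterable[object]) -> str:
    """Hypothetical helper: build one CSV line with standard quoting rules."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    # Newlines are flattened to spaces so one item can never span two rows.
    cleaned = [str(f).replace("\r", " ").replace("\n", " ") for f in fields]
    writer.writerow(cleaned)
    # Drop the trailing line terminator; the caller joins lines itself.
    return buffer.getvalue()[:-1]


line = csv_row(["UP0001", 1, 'He said "hi", ok', "desc"])
parsed = next(csv.reader([line]))
```

If the client turns out to require the name and URL columns to be quoted unconditionally, as the current feed does, csv.QUOTE_ALL or a manual wrapper would be closer to the existing output.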

Comment on lines +410 to +423
txt_lines.append(
    ",".join(
        [
            pkgi_item.contentid,
            str(pkgi_item.flags),
            f'"{pkgi_item.name}"',
            pkgi_item.name2,
            pkgi_item.zrif,
            f'"{pkgi_item.url}"',
            str(pkgi_item.size),
            pkgi_item.checksum,
        ]
    )
)

security-high high

Similar to the PKGi PS3 feed, the PS Vita feed is vulnerable to CSV injection if the ROM name or other fields contain double quotes or newlines. Delimiters and newlines should be properly escaped or replaced.

Suggested change
txt_lines.append(
    ",".join(
        [
            pkgi_item.contentid,
            str(pkgi_item.flags),
            # Double embedded quotes. Built by concatenation because backslashes
            # inside f-string expressions are a SyntaxError before Python 3.12.
            '"' + pkgi_item.name.replace('"', '""') + '"',
            pkgi_item.name2.replace('"', '""'),
            pkgi_item.zrif.replace('"', '""'),
            f'"{pkgi_item.url}"',
            str(pkgi_item.size),
            pkgi_item.checksum,
        ]
    )
)

Comment on lines +611 to +625
txt_lines.append(
    "\t".join(
        [
            kekatsu_item.title,
            kekatsu_item.platform,
            kekatsu_item.region,
            kekatsu_item.version,
            kekatsu_item.author,
            kekatsu_item.download_url,
            kekatsu_item.filename,
            str(kekatsu_item.size),
            kekatsu_item.box_art_url,
        ]
    )
)

security-high high

The Kekatsu feed uses a tab-separated format. If fields like the ROM title or filename contain tabs or newlines, it will break the TSV structure and potentially allow for column or row injection. Tabs and newlines should be replaced with spaces or escaped.

Suggested change
txt_lines.append(
    "\t".join(
        [
            kekatsu_item.title.replace("\t", " ").replace("\n", " ").replace("\r", " "),
            kekatsu_item.platform,
            kekatsu_item.region,
            kekatsu_item.version,
            kekatsu_item.author,
            kekatsu_item.download_url,
            kekatsu_item.filename.replace("\t", " ").replace("\n", " ").replace("\r", " "),
            str(kekatsu_item.size),
            kekatsu_item.box_art_url,
        ]
    )
)

Comment on lines +698 to +745
def pkgj_psp_dlcs_feed(request: Request) -> Response:
    platform = db_platform_handler.get_platform_by_slug(UPS.PSP)
    if not platform:
        raise HTTPException(
            status_code=404, detail="PlayStation Portable platform not found"
        )

    roms = db_rom_handler.get_roms_scalar(platform_ids=[platform.id])
    txt_lines = []
    txt_lines.append(
        "Title ID\tRegion\tName\tPKG direct link\tContent ID\tLast Modification Date\tRAP\tDownload .RAP file\tFile Size\tSHA256"
    )

    for rom in roms:
        download_url = generate_rom_download_url(request, rom)
        last_modified = _format_pkgj_datetime(rom.updated_at)

        pkgj_item = PkgjPSPDlcsItemSchema(
            title_id="",
            region=rom.regions[0] if rom.regions else "",
            name=(rom.name or rom.fs_name_no_tags).strip(),
            download_link=download_url,
            content_id="",
            last_modified=rom.updated_at,
            rap="",
            download_rap_file="",
            file_size=rom.fs_size_bytes,
            sha_256=rom.sha1_hash or "",
        )

        txt_lines.append(
            "\t".join(
                [
                    pkgj_item.title_id,
                    pkgj_item.region,
                    pkgj_item.name,
                    pkgj_item.download_link,
                    pkgj_item.content_id,
                    last_modified,
                    pkgj_item.rap,
                    pkgj_item.download_rap_file,
                    str(pkgj_item.file_size),
                    pkgj_item.sha_256,
                ]
            )
        )

    return _text_response(txt_lines, "pkgj_psp_dlc.txt")

high

This endpoint is intended to provide a feed of DLCs, but it fetches all ROMs for the platform without any filtering (db_rom_handler.get_roms_scalar(platform_ids=[platform.id])). This means it will incorrectly include games and other content types in the DLC feed.

The roms should be filtered to include only those that are actual DLCs. This could be done by checking the category of the RomFiles associated with each Rom. For example:

all_roms = db_rom_handler.get_roms_scalar(platform_ids=[platform.id])
roms = [rom for rom in all_roms if any(f.category == RomFileCategory.DLC for f in rom.files)]

A more efficient approach would be to filter at the database level if possible. This issue also applies to the other games and dlc feeds, which should similarly filter for their respective content types.
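
To make the in-memory filtering idea above concrete, here is a small self-contained sketch. The RomFileCategory, RomFile, and Rom classes below are simplified, hypothetical stand-ins for the project's real models (only the category and files attributes named in this comment are modeled), and roms_with_category is an illustrative helper name, not an existing function.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterable, List, Optional


class RomFileCategory(Enum):
    """Hypothetical stand-in for the project's enum."""
    DLC = "dlc"
    UPDATE = "update"


@dataclass
class RomFile:
    """Hypothetical stand-in: only the category attribute is modeled."""
    category: Optional[RomFileCategory] = None


@dataclass
class Rom:
    """Hypothetical stand-in: only name and files are modeled."""
    name: str
    files: List[RomFile] = field(default_factory=list)


def roms_with_category(roms: Iterable[Rom], category: RomFileCategory) -> List[Rom]:
    """Keep ROMs that have at least one file in the given category."""
    return [
        rom for rom in roms
        if any(f.category == category for f in rom.files)
    ]


all_roms = [
    Rom("Base Game", [RomFile()]),
    Rom("Costume Pack", [RomFile(RomFileCategory.DLC)]),
    Rom("Patch 1.01", [RomFile(RomFileCategory.UPDATE)]),
]
dlc_names = [rom.name for rom in roms_with_category(all_roms, RomFileCategory.DLC)]
```

A games feed would need the complementary rule. How a plain game is identified in the real schema is not stated in this thread, so that part is left open here.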

Additionally, there's a naming inconsistency. The function is pkgj_psp_dlcs_feed (plural), but the endpoint path is /pkgj/psp/dlc (singular) and the filename is pkgj_psp_dlc.txt (singular). For consistency with the games endpoints, consider using the plural form dlcs in the path and filename as well.

Comment on lines +313 to +343
def test_pkgj_psp_dlc_feed(
    client: TestClient, access_token: str, platform: Platform, rom: Rom
):
    platform = db_platform_handler.update_platform(
        platform.id,
        {"name": "PlayStation Portable", "slug": UPS.PSP, "fs_slug": UPS.PSP},
    )
    db_rom_handler.update_rom(
        rom.id,
        {
            "platform_id": platform.id,
            "name": "Test PSP DLC",
            "fs_name": "Test PSP DLC.pkg",
            "fs_name_no_tags": "Test PSP DLC",
            "fs_name_no_ext": "Test PSP DLC",
            "fs_extension": "pkg",
            "fs_path": f"{platform.slug}/roms",
            "fs_size_bytes": 123,
            "sha1_hash": "deadbeef",
            "regions": ["US"],
        },
    )

    response = client.get(
        "/api/feeds/pkgj/psp/dlc",
        headers={"Authorization": f"Bearer {access_token}"},
    )
    assert response.status_code == status.HTTP_200_OK
    assert response.headers["content-disposition"] == "filename=pkgj_psp_dlc.txt"
    assert "Test PSP DLC" in response.text

medium

This test for the DLC feed is not specific enough. It currently asserts that a ROM named 'Test PSP DLC' is in the response, but it doesn't ensure that only DLCs are present. The test ROM isn't configured as a DLC, so this test would pass even with the filtering bug in the endpoint.

To make this test more robust, you should:

  1. Set up test data where some ROMs are explicitly categorized as DLCs (e.g., by adding a RomFile with category=RomFileCategory.DLC) and others are not.
  2. Assert that the DLC feed contains the DLC ROM.
  3. Assert that the DLC feed does not contain the non-DLC ROM.

This would verify that the endpoint's filtering logic is working correctly. This same principle applies to the other new pkgj feed tests.
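
One way to write the positive and negative assertions described above is to parse the feed body instead of searching the raw text, so a match in an unrelated column cannot produce a false pass. The sketch below is a standalone illustration: feed_column is a hypothetical test helper, and the sample feed uses a shortened header rather than the full one produced by the endpoint.

```python
from typing import List


def feed_column(feed_text: str, column: str) -> List[str]:
    """Hypothetical test helper: return every value of one named TSV column."""
    lines = [line for line in feed_text.splitlines() if line]
    if not lines:
        return []
    # The first line is the header; locate the requested column by name.
    index = lines[0].split("\t").index(column)
    values = []
    for line in lines[1:]:
        cells = line.split("\t")
        if len(cells) > index:
            values.append(cells[index])
    return values


# Shortened sample body standing in for response.text in a real test.
sample_feed = "\n".join(
    [
        "Title ID\tRegion\tName\tPKG direct link",
        "\tUS\tTest PSP DLC\thttps://romm.example/dlc.pkg",
    ]
)
names = feed_column(sample_feed, "Name")
```

In an actual test, names would be built from response.text, followed by one assertion that the DLC ROM's name is present and one that the non-DLC ROM's name is absent.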

@github-actions
Contributor

github-actions bot commented Feb 4, 2026

Test Results

800 tests ±0: 799 passed ✅ ±0, 1 skipped 💤 ±0, 0 failed ❌ ±0
1 suite ±0, 1 file ±0
Duration: 2m 3s ⏱️ (-3s)

Results for commit 7cc2097. ± Comparison against base commit 0b0756f.

♻️ This comment has been updated with latest results.

@github-actions
Contributor

github-actions bot commented Feb 4, 2026

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines: 12926 | Covered: 8510 | Coverage: 66% | Threshold: 0% | Status: 🟢

New Files

No new covered files...

Modified Files

File Coverage Status
backend/endpoints/feeds.py 21% 🟢
backend/endpoints/responses/feeds.py 94% 🟢
TOTAL 57% 🟢

updated for commit: 7cc2097 by action🐍
