Skip to content

feat: Add school_level column (985/211/双一流) to major export#2

Merged
freecho merged 2 commits intomainfrom
copilot/update-major-add-school-level
Apr 1, 2026
Merged

feat: Add school_level column (985/211/双一流) to major export#2
freecho merged 2 commits intomainfrom
copilot/update-major-add-school-level

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 1, 2026

Adds a school_level column to the major table and CSV export, populated at insert time based on the school name. Values are slash-joined tier labels (e.g. 985/211/双一流); empty string if unmatched.

Changes

  • data/school_level.py (new): Three frozen sets (SCHOOLS_985, SCHOOLS_211, SCHOOLS_SHUANGYILIU) and a lookup function:
    def get_school_level(school_name: str) -> str:
        name = school_name.strip() if school_name else ""
        levels = []
        if name in SCHOOLS_985:       levels.append("985")
        if name in SCHOOLS_211:       levels.append("211")
        if name in SCHOOLS_SHUANGYILIU: levels.append("双一流")
        return "/".join(levels)
  • data/entity.py: school_level = Column(String(50)) inserted immediately after school_name — preserves column order in CSV export (which reads Major.__table__.columns directly).
  • data/db.py: insert() calls get_school_level(item.get("dwmc", "")) to populate the field on every record.
  • yzw.sql: school_level varchar(50) added after school_name in the DDL.
  • README.md: New "导出字段说明" section documents the field and its possible values.
Original prompt

在仓库 freecho/yzw 中实现 issue #1https://github.com/freecho/yzw/issues/1):在导出的专业数据中,在学校名称列(school_name)后新增一列“学校层级”,支持显示 985/211/双一流 标签。

要求:

  1. 数据库层面:更新 yzw.sql,在 major 表中添加新字段(建议名:school_level 或 school_tier),并确保字段顺序在 school_name 之后。
  2. ORM 层面:更新 data/entity.py 的 Major 模型,新增对应 Column。
  3. 入库逻辑:更新 data/db.py 的 insert(),根据学校名(dwmc)为每条记录填充学校层级字段。
  4. 映射数据:在仓库中新增一个学校名->层级的映射数据文件(例如 data/school_level.py 或 data/school_level.json),并在代码中读取/使用。
    • 若一个学校属于多个层级,导出值使用斜杠拼接,例如 "985/211/双一流"。
    • 若未匹配到,填空字符串。
  5. 导出:export_major_csv.py 当前自动导出 Major 的所有列;确保新增列会出现在 CSV 表头中且位置在 school_name 后。
  6. 文档:更新 README.md 说明新增字段。

实现细节建议:

  • 将映射字典封装为一个函数 get_school_level(school_name: str) -> str,便于后续维护��
  • 注意中文学校名称的常见别名/空格差异,至少做 strip() 处理。

交付:

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: @freecho/yzw/issues/1 你能直接帮我实现这个issue提到的功能,并且直接提交新的commit吗?或者你发起pull request?

Assistant: 可以实现,而且我建议用**“直接提交 commit 到 main”**的方式来集成(你也说了“直接提交新的commit”可以接受)。

基于仓库当前结构,你的数据导出是通过 export_major_csv.py 直接把 Major 表的字段导出为 CSV(它会读取 data/entity.pyMajor.__table__.columns 作为表头与字段列表),同时数据库表结构在 yzw.sql 里定义。也就是说:要让“学校层级列”稳定地出现在导出的 CSV 里,最合理的是把它做成数据库字段 + ORM 字段,这样爬取入库、SQL 表结构、CSV 导出都会统一对齐。当前 major 表只有 school_name,没有“层级”列。export_major_csv.py 也不会凭空多导出一个计算列。

落地实现通常会包含这些改动点(对应你 issue 的“在学校列后加一个学校层级列,985/211/双一流”):

  1. 数据库表结构(yzw.sql
    school_name 后新增一列,例如:school_tier / school_level(建议 school_levelschool_tags 更通用),用于存 985/211/双一流 这类标签。

  2. ORM 实体(data/entity.py
    Major 增加对应的 Column(String(...)) 字段。因为导出脚本是按 ORM 的列顺序导出,这里也能保证“在学校列后面”。

  3. 入库逻辑(data/db.pyinsert()
    目前插入时只写 school_name=item.get("dwmc")。需要在这里根据 dwmc(学校名)查一个映射,写入 school_level 字段。

  4. 映射数据维护方式(建议新增一个数据文件/模块)
    你提到“本地这边让 AI 帮我加上了”,如果你愿意把那份 mapping(学校名 -> 985/211/双一流)整理成文件提交到仓库里,会非常好集成,也更准确。项目里目前没有任何“学校层级”数据源,因此需要新增一个映射表(例如 data/school_level.pydata/school_level.json),由代码读取。

  5. README(可选但推荐)
    在“数据导出工具”或字段说明里补一句:导出的 CSV 会新增 school_level(或你命名的字段),值为 985/211/双一流/空(空表示未匹配到或非这些类别)。

如果你希望我直接照你本地那份 mapping 集成,你把它贴出来(或者给个文件内容:JSON/CSV/py dict 都行),我就能按你的数据标准实现:

  • 字段命名你希望叫啥(school_level/school_tier/school_tags)?
  • 如果一个学校同时属于多个(常见情况:既是 985 又是 211,也属于双一流),你希望导出值是:"985/211/双一流" 这样用 / 拼接,还是优先级只保留一个?

目前我这边已经能明确:需要改的核心文件就是 yzw.sqldata/entity.pydata/db.py、(可能新增)一个映射文件,以及导出脚本会自动带上新字段(因为它读取 ORM 的列)。

User: 我希望能由你发起pull request,GitHub似乎有这个功能呢,我记得可以@然后让ai处理?

Assistant: 现在需要你在弹窗里确认/授权我刚才发起的自动改代码并创建 PR 的请求;你点“允许/确认”后我才能继续在 freecho/yzw 里完成实现并提交一个包含该功能的 Pull Request(会关联并关闭 #1)。

This pull request was created from Copilot chat.

@freecho freecho marked this pull request as ready for review April 1, 2026 10:00
@freecho freecho requested a review from Copilot April 1, 2026 10:00
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@freecho freecho requested a review from Copilot April 1, 2026 10:02
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@freecho freecho marked this pull request as draft April 1, 2026 10:04
Copilot AI changed the title [WIP] Add school level column to major data export feat: Add school_level column (985/211/双一流) to major export Apr 1, 2026
Copilot AI requested a review from freecho April 1, 2026 10:06
@freecho freecho marked this pull request as ready for review April 1, 2026 10:10
Copilot AI requested a review from freecho April 1, 2026 10:14
@freecho
Copy link
Copy Markdown
Owner

freecho commented Apr 1, 2026

@copilot review the pr

@freecho
Copy link
Copy Markdown
Owner

freecho commented Apr 1, 2026

@copilot bad review, try it adain

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 1, 2026

@copilot bad review, try it adain

I re-reviewed the PR based only on implementation correctness (not external school-list authority verification):

school_level integration is consistent and complete across all affected layers:

  • schema: yzw.sql adds school_level in the expected position
  • ORM: data/entity.py adds school_level right after school_name
  • write path: data/db.py populates it on insert via get_school_level(...)
  • export path: export_major_csv.py uses Major.__table__.columns, so the new column is exported automatically
  • docs: README explains the new column and values

get_school_level() behavior is deterministic and safe for empty/None-like input in current call path.

⚠️ Non-blocking improvement:

  • Matching is exact-string based, so aliases/variant names (e.g., brackets, extra spaces, historical renames) may miss classification. If needed, we can add a normalization/alias map in a follow-up.

No UI changes in this PR, so no UI screenshot is applicable.

Repository owner deleted a comment from Copilot AI Apr 1, 2026
@freecho freecho merged commit bd03229 into main Apr 1, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants