fix(list_datasets): read the built-in dataset catalog and remove the broken /datalist call #7
Merged
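Problem: the list_datasets tool and the ChatGPT Action both call /api/v4/datalist to fetch "all dataset names", but that endpoint actually returns a list of countries (["Canda","China","Euro","Japan","Taiwan","UK"]), not datasets. FinMind has no API that lists all datasets; the real purpose of /datalist is to list the data_id values under a given dataset. This bug shipped with PyPI 0.0.2: MCP users who call list_datasets get a meaningless list of countries, and a Custom GPT that calls listDatasets gets the same result and falls back to guessing from its knowledge base.

Fix: the dataset master list is already provided by knowledge/datasets.md (the same SSOT as the Custom GPT knowledge bundle), so list_datasets now reads that file and no longer calls the API.

Files changed:
- src/finmind_mcp/knowledge.py: add dataset_catalog(), which parses the ## category / ### dataset / Tier / description entries in datasets.md and returns 90 ordered records; returns [] when datasets.md is missing (graceful degradation when CI has no knowledge/ directory).
- src/finmind_mcp/tools.py: _list_datasets now calls knowledge.dataset_catalog() and prints name, tier, and description grouped by category; it no longer calls the client; tool description and module docstring updated.
- src/finmind_mcp/client.py: remove list_datasets() (the broken /v4/datalist call).
- chatgpt/openapi.yaml: remove the listDatasets operation, the /v4/datalist path, and the DatasetList schema; keep only queryDataset (the GPT identifies datasets from the knowledge bundle).
- knowledge/instructions.md: remove the incorrect "/datalist: list available datasets" statement and note instead that FinMind has no such API and that the full list is in knowledge_bundle.md.
- docs/spec.md, docs/mcp-original-readme.md: correct the tool-to-endpoint mapping tables to match.

Tests:
- tests/test_knowledge.py (new): checks that dataset_catalog parses at least 80 records, that the required fields are present, that the Tier legend produces no record, and that TaiwanStockPrice is Free.
- tests/test_tools.py: the list_datasets test now checks that the result is read from the knowledge base and that client.calls is empty; list_datasets/list_result removed from FakeClient.
- tests/test_client.py: remove test_list_datasets; test_token_from_env now uses query_dataset to verify the env token.

Verification: 32 passed, smoke OK, build_instructions 7986/8000 chars, openapi validation passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>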
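As a rough illustration of the parsing described in the commit message, here is a minimal sketch of a catalog parser. It is not the code in src/finmind_mcp/knowledge.py: the exact layout of datasets.md is not shown in this PR, so the `- Tier: <name>` line format, the helper name `parse_catalog`, the `path` argument, and the sample text are all assumptions made for the example.

```python
import re
from pathlib import Path

# Assumed tier line format ("- Tier: Free"); the real datasets.md may differ.
TIER_RE = re.compile(r"^[-*]?\s*Tier\s*[::]\s*(\w+)")


def parse_catalog(text: str) -> list[dict[str, str]]:
    """Turn '## category' / '### dataset' markdown into ordered records."""
    records: list[dict[str, str]] = []
    category = ""
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("### "):
            # A third-level heading starts a new dataset record.
            current = {"category": category, "name": line[4:].strip(),
                       "tier": "", "description": ""}
            records.append(current)
        elif line.startswith("## "):
            # A second-level heading starts a category. Text directly under
            # it (for example a tier legend) belongs to no dataset.
            category = line[3:].strip()
            current = None
        elif current is not None and line:
            match = TIER_RE.match(line)
            if match:
                current["tier"] = match.group(1)
            elif not current["description"]:
                # First non-tier line under a dataset is its description.
                current["description"] = line
    return records


def dataset_catalog(path: str) -> list[dict[str, str]]:
    """Return parsed records, or [] when the file is missing."""
    file = Path(path)
    if not file.is_file():
        return []
    return parse_catalog(file.read_text(encoding="utf-8"))


# Made-up sample input; only the TaiwanStockPrice/Free pairing is from the PR.
SAMPLE_MD = """# Datasets

## Tier legend
- Free: placeholder legend text

## Technical
### TaiwanStockPrice
- Tier: Free
Placeholder description.

### ExampleDataset
- Tier: Paid
Another placeholder description.
"""

records = parse_catalog(SAMPLE_MD)
```

Resetting `current` at each category heading is what keeps a legend section from producing a record, and returning `[]` for a missing file mirrors the graceful degradation the commit message mentions for CI runs without knowledge/.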
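Problem (already shipped with PyPI 0.0.2)

The list_datasets tool and the ChatGPT Action's listDatasets both call /api/v4/datalist to fetch "all dataset names", but that endpoint actually returns a list of countries: ["Canda","China","Euro","Japan","Taiwan","UK"]. FinMind has no API that lists all datasets; the real purpose of /datalist is to list the data_id values under a given dataset.

Consequences:
- MCP: list_datasets returns a meaningless list of countries.
- Custom GPT: listDatasets gets the same result, so the GPT falls back to "guessing" datasets from its knowledge base (reproduced in the GPT builder preview).

Fix

The dataset master list is already provided by knowledge/datasets.md (the same SSOT as the Custom GPT knowledge bundle). list_datasets now reads that file and no longer calls the API.

- src/finmind_mcp/knowledge.py: add dataset_catalog(), which parses datasets.md (90 records with category, tier, and description) and returns [] when the file is missing
- src/finmind_mcp/tools.py: _list_datasets reads the catalog and does not call the client; tool description updated
- src/finmind_mcp/client.py: remove list_datasets() (/v4/datalist)
- chatgpt/openapi.yaml: remove the listDatasets operation, the /v4/datalist path, and the DatasetList schema
- knowledge/instructions.md: remove the incorrect statement that /datalist lists available datasets
- docs/spec.md, docs/mcp-original-readme.md: correct the tool-to-endpoint mapping tables to match

Tests

- tests/test_knowledge.py (new): checks that the catalog parses at least 80 records, that fields are complete, that the legend produces no record, and that TaiwanStockPrice is Free.
- tests/test_tools.py: list_datasets now checks that the result is read from the knowledge base and that the client is not called.
- tests/test_client.py: remove test_list_datasets; test_token_from_env now uses query_dataset.

Verification: 32 passed, smoke OK, build_instructions 7986/8000 chars, openapi validation passed.

Release

After merging, a new tag (for example v0.0.3) must be pushed to trigger CI/CD to republish to PyPI; only then does the fix reach users.

🤖 Generated with Claude Code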
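For readers who want a concrete picture of the tools.py change and its test, here is a minimal sketch under stated assumptions. The function name `list_datasets_tool`, its `load_catalog` parameter, the output layout, and the sample records are hypothetical and not taken from the repository; only the idea (format the catalog by category, never touch the client, and assert that `client.calls` stays empty) comes from the description above.

```python
from typing import Callable


class FakeClient:
    """Test double that records every call made to it."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, dict]] = []

    def query_dataset(self, **params) -> list:
        self.calls.append(("query_dataset", params))
        return []


def list_datasets_tool(client, load_catalog: Callable[[], list[dict]]) -> str:
    """Format the local catalog by category; `client` is deliberately unused."""
    catalog = load_catalog()
    if not catalog:
        # Hypothetical fallback text for the missing-file case.
        return "Dataset catalog is not available in this installation."
    grouped: dict[str, list[dict]] = {}
    for record in catalog:
        # dict preserves insertion order, so categories keep catalog order.
        grouped.setdefault(record["category"], []).append(record)
    lines: list[str] = []
    for category, records in grouped.items():
        lines.append(f"{category}:")
        for record in records:
            lines.append(
                f"- {record['name']} [{record['tier']}]: {record['description']}"
            )
    return "\n".join(lines)


# Made-up records; only the TaiwanStockPrice/Free pairing is from the PR.
sample_catalog = [
    {"category": "Technical", "name": "TaiwanStockPrice",
     "tier": "Free", "description": "Placeholder description."},
    {"category": "Other", "name": "ExampleDataset",
     "tier": "Paid", "description": "Another placeholder."},
]

client = FakeClient()
output = list_datasets_tool(client, lambda: sample_catalog)
empty_output = list_datasets_tool(client, lambda: [])
```

Passing the catalog loader in as a callable keeps the sketch self-contained; the real tool calls knowledge.dataset_catalog() directly.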