Agent YouTube Journalism is an open-source investigative AI assistant that transcribes, summarizes, analyzes, and answers questions about videos from YouTube — especially those related to Brazilian politics and public interest.
The system uses multi-agent reasoning and Retrieval-Augmented Generation (RAG) to:
- Transcribe YouTube videos in Brazilian Portuguese (using OpenAI Whisper API)
- Summarize the transcript with DeepSeek via Groq Cloud
- Search the web for context (via DuckDuckGo)
- Highlight journalistically relevant parts of the video
- Index the transcript and answer questions from it, drawing on general knowledge if needed
🟢 Try it online: https://agentytjournalism.streamlit.app/
- User enters YouTube video URL + API keys
- The app:
  - Downloads audio via `yt-dlp`
  - Transcribes using Whisper (`openai.Audio.transcribe`)
  - Summarizes with Groq LLM (DeepSeek)
  - Searches the web for context
  - Highlights journalistic investigation leads
  - Indexes the transcript with FAISS
- The user can:
  - View the analysis
  - Ask questions based on the video (with RAG + LLM knowledge fallback)
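The order of the app's steps can be sketched as a single orchestration function. This is a minimal illustration, not the project's actual `process_video.py`: the `VideoAnalysis` dataclass, the `process_video` signature, and the inputs each stage receives (for example, searching the web with the summary) are assumptions made for the example. Each stage is passed in as a plain callable so the flow can be run without network access or API keys.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VideoAnalysis:
    """Hypothetical container for everything the pipeline produces."""
    transcript: str
    summary: str
    web_context: str
    highlights: str
    chunks: list[str]


def process_video(
    url: str,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
    search_web: Callable[[str], str],
    highlight: Callable[[str, str], str],
    index: Callable[[str], list[str]],
) -> VideoAnalysis:
    """Run the stages in the order listed above and collect their outputs."""
    transcript = transcribe(url)                     # download audio + Whisper
    summary = summarize(transcript)                  # LLM summary
    web_context = search_web(summary)                # assumed: query built from the summary
    highlights = highlight(transcript, web_context)  # investigation leads
    chunks = index(transcript)                       # stand-in for FAISS indexing
    return VideoAnalysis(transcript, summary, web_context, highlights, chunks)
```

Injecting the stages as arguments keeps the ordering logic testable on its own; in the real app each callable would wrap the corresponding tool.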
The system uses smolagents to structure the reasoning with a clear cycle:
Thought → Code → Observation
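The cycle can be pictured with a toy loop. The sketch below does not use the smolagents API; `run_cycle` and the `plan` callback are hypothetical names, and a real agent would have an LLM produce the thought and the code at each step rather than a hand-written function.

```python
from typing import Callable, Optional

# One step is a (thought, code) pair: a short rationale plus a callable
# that performs the action and returns its result as text.
Step = tuple[str, Callable[[], str]]
History = list[tuple[str, str]]  # (thought, observation) pairs so far


def run_cycle(
    plan: Callable[[History], Optional[Step]],
    max_steps: int = 5,
) -> History:
    """Repeat Thought -> Code -> Observation until the planner stops."""
    history: History = []
    for _ in range(max_steps):
        step = plan(history)      # Thought: decide what to do next
        if step is None:          # the planner signals it is done
            break
        thought, code = step
        observation = code()      # Code: execute the chosen action
        history.append((thought, observation))  # Observation: feed it back
    return history
```

The `max_steps` cap is there so a planner that never returns `None` cannot loop forever.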
- `app.py`: Main Streamlit app with two tabs: Analysis & Questions
- `process_video.py`: Orchestrates full pipeline (transcription → summary → highlight → indexing)
- `rag_question_tab.py`: Handles the RAG-based Q&A flow with session state
- `agent_config.py`: Defines tools and setup for `smolagents` agent
- `streamlit_app.yaml`: Config file for deployment (Streamlit Community Cloud)
- `prompts.yaml`: Prompt templates used for summarization, analysis, and code reasoning

- `groq_model.py`: Executes prompts with Groq LLM and truncates long prompts when needed
- `list_groq_models.py`: Lists all Groq-hosted models available for querying

- `tools/youtube_transcriber.py`: Downloads and transcribes video audio via Whisper API
- `tools/summarization.py`: Summarizes the transcript using DeepSeek
- `tools/web_search.py`: Searches DuckDuckGo for current context
- `tools/journalistic_highlight.py`: Generates public interest highlights
- `tools/index_transcript.py`: Splits transcript and indexes it with FAISS
- `tools/rag_query.py`: Performs RAG query and allows fallback to general LLM knowledge
- `tools/__init__.py`: Makes the tools importable as a module

- `requirements.txt`: All Python dependencies (tested with Python 3.12)
- `packages.txt`: System dependencies (e.g., ffmpeg, build tools)
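Before a transcript can be indexed it has to be split into pieces. The sketch below shows one common approach, fixed-size character windows with a small overlap. It is an assumption about how `tools/index_transcript.py` might split text, not a copy of it; `split_transcript`, `chunk_size`, and `overlap` are hypothetical names and defaults.

```python
def split_transcript(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows that share `overlap` characters.

    The overlap keeps a sentence that straddles a boundary visible in
    both neighbouring chunks, which tends to help retrieval.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last window is not just a repeat of
    # the tail already covered by the previous one.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]
```

For example, `split_transcript("abcdefghij", chunk_size=4, overlap=1)` yields three chunks, each sharing one character with its neighbour. Each chunk would then be embedded and added to the FAISS index.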
Here are potential enhancements for the project:
- Fixed repeated video download when switching tabs
- Added session state to persist transcript/vectorstore between tabs
- Enabled mixed-source RAG answers (video + general knowledge)
- Adjusted `requirements.txt` and `packages.txt` for compatibility
- Caching
  - Save FAISS vectorstore, summary, highlights to disk (`.save_local()`) or use `st.cache_data()`
- Better Streamlit UX
  - Enable chat-style Q&A with memory
  - Show progress for each processing step
  - Add "Download PDF" report button
- Model prompting
  - Add clear tags like `[FACT FROM VIDEO]` vs `[LLM KNOWLEDGE]`
- Testing
  - Add unit/integration tests using `pytest`
- Agent orchestration
  - Split into two agents: `VideoAnalysisAgent` and `QAAgent`
  - Optionally adopt CrewAI or LangGraph for more complex flows
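As an illustration of the caching idea, the sketch below stores JSON-serializable results (such as the summary and highlights) on disk, keyed by a hash of the video URL. It uses only the standard library; `cache_path` and `load_or_compute` are hypothetical helpers, and persisting the FAISS vectorstore itself would still go through `.save_local()` as noted in the list.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def cache_path(cache_dir: Path, url: str) -> Path:
    """Derive a stable file name from the video URL."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return cache_dir / f"{digest}.json"


def load_or_compute(cache_dir: Path, url: str, compute: Callable[[str], dict]) -> dict:
    """Return the cached analysis for `url`, computing and saving it on a miss."""
    path = cache_path(cache_dir, url)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    result = compute(url)  # the expensive step: transcription, LLM calls, ...
    cache_dir.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps Portuguese accents readable in the file.
    path.write_text(json.dumps(result, ensure_ascii=False), encoding="utf-8")
    return result
```

With this in place, processing the same URL twice would trigger the expensive step only once; inside Streamlit, `st.cache_data()` is the alternative the list mentions.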
MIT License. See LICENSE file.
Developed by Reinaldo Chaves (@reichaves) — journalist, data scientist, and investigative technologist.
Open an issue or contact via GitHub.