A sophisticated agent system that maintains state and context through the combination of persistent entity storage (SQLite) and vector-based memory (ChromaDB).
Stateful-Agent is designed to provide a robust solution for maintaining conversational context and user information across interactions. It leverages two different types of databases:
- SQLite: For structured data storage and retrieval of user information and persistent state
- ChromaDB: For vector-based storage enabling semantic search and contextual memory
- Persistent user data storage with SQLite
- Semantic memory capabilities using the ChromaDB vector database
- Case-insensitive user handling
- Duplicate prevention for user entries
- Structured data validation and management
- PDF document processing and analysis
- Integration with various external tools (GitHub, Slack, Google)
- Research lab collections management
- Google Scholar paper tracking and crawling
- Paper recommendation based on research interests
- Contextual paper summarization with related research
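As an illustration of the case-insensitive handling and duplicate prevention listed above, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout and the helper names (`open_user_store`, `add_user`, `get_user`) are assumptions made for this example and are not taken from the project's `sqlite.py`.

```python
import sqlite3


def open_user_store(path=":memory:"):
    """Open a SQLite database whose user names are unique regardless of case."""
    conn = sqlite3.connect(path)
    # COLLATE NOCASE makes both the UNIQUE constraint and '=' comparisons
    # ignore ASCII case differences (non-ASCII letters are not folded).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        " name TEXT NOT NULL COLLATE NOCASE UNIQUE,"
        " info TEXT NOT NULL DEFAULT '')"
    )
    return conn


def add_user(conn, name, info=""):
    """Insert a user; return False if the name already exists in any casing."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (name, info) VALUES (?, ?)",
                (name.strip(), info),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def get_user(conn, name):
    """Look up a user by name, ignoring case; return (name, info) or None."""
    return conn.execute(
        "SELECT name, info FROM users WHERE name = ?", (name.strip(),)
    ).fetchone()
```

With this layout, adding "Alice" and then "ALICE" stores a single row, and a lookup for "alice" returns the name in its original casing.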
The agent now supports robust academic paper management and recommendation features:
- Create lab collections with persistent information (name, institution, leader, members, etc.)
- Add lab members with their Google Scholar profiles
- Track papers published by lab members
- Automatically crawl Google Scholar pages of lab members to collect their arXiv papers
- Check for new papers by lab members during conversations
- Store PDF documents in the data directory with proper organization
- Recommend relevant papers from arXiv based on the lab's research interests and time period
- Save recommended papers and their embeddings to prevent duplication
- Generate comprehensive paper summaries for specific lab member papers
- Utilize complete paper content for more thorough and accurate summaries of target papers
- Extract semantic sections (introduction, conclusion) from LaTeX source files of related papers when available
- Include contextual information from related papers in the lab collection
- Draw insights from both lab papers and recommended papers
- Provide academic-style summaries with key findings, methodologies, and relationships to existing research
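The recommendation and duplicate-prevention steps above can be sketched as a similarity ranking over embeddings. This is a simplified illustration, not the project's implementation: it assumes embeddings are plain lists of floats (in practice they would come from the configured OpenAI embedding model and be stored in ChromaDB), and the names `cosine_similarity` and `recommend` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(interest_vec, candidates, saved_ids, top_k=5):
    """Rank candidate papers by similarity to the lab's interest vector.

    candidates: list of (paper_id, embedding) pairs (toy data in this sketch).
    saved_ids:  set of paper IDs that were already recommended; it is updated
                in place so the same paper is not recommended twice.
    """
    scored = [
        (cosine_similarity(interest_vec, embedding), paper_id)
        for paper_id, embedding in candidates
        if paper_id not in saved_ids  # skip papers saved earlier
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    picked = [paper_id for _, paper_id in scored[:top_k]]
    saved_ids.update(picked)
    return picked
```

Calling `recommend` a second time with the same candidates and the same `saved_ids` set returns only papers that were not picked before.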
- Create a LinkedIn Developer Application:
  - Go to the LinkedIn Developer Portal
  - Create a new app, providing:
    - App name
    - Application logo
    - LinkedIn company page (required; cannot be a profile page)
  - Accept the terms and conditions
- Configure Application Permissions:
  - Under Products, request access for:
    - "Share on LinkedIn" (adds the w_member_social scope)
    - "Sign In with LinkedIn using OpenID Connect" (adds the openid and email scopes)
- Generate Access Token (for personal use, recommended):
  - Go to the LinkedIn OAuth2 tools
  - Generate a token with scopes: `w_member_social openid email profile`
  - Note: Access tokens are valid for 60 days
- Get User ID:
curl --location 'https://api.linkedin.com/v2/userinfo' \
  --header 'Authorization: Bearer YOUR_ACCESS_TOKEN'
Save the returned user ID.
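For readers who prefer Python over curl, the same lookup can be sketched with the standard library. The helper names below are hypothetical, and the sketch assumes that the user ID to save is the OpenID Connect `sub` claim of the userinfo response; check the actual response if it differs for your app.

```python
import json
import urllib.request

USERINFO_URL = "https://api.linkedin.com/v2/userinfo"


def build_userinfo_request(access_token):
    """Build the same GET request as the curl command above."""
    return urllib.request.Request(
        USERINFO_URL,
        headers={"Authorization": f"Bearer {access_token}"},
    )


def extract_user_id(userinfo):
    """Return the user ID from a parsed userinfo response.

    Assumption: the ID is carried in the OpenID Connect 'sub' claim.
    """
    return userinfo["sub"]


def fetch_user_id(access_token, timeout=10.0):
    """Call the userinfo endpoint and return the user ID (needs network access)."""
    request = build_userinfo_request(access_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_user_id(json.load(response))
```

`fetch_user_id("YOUR_ACCESS_TOKEN")` returns the value to store as `LINKEDIN_USER_ID`.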
Add the following to your .env file:
LINKEDIN_USER_ID=your_user_id
LINKEDIN_ACCESS_TOKEN=your_access_token
- Supports text posts with formatting and emojis
- Configurable visibility (PUBLIC or CONNECTIONS)
- Automatic error handling and validation
- Environment-based configuration
The LinkedIn publisher is available as an agent tool and can be used with the following parameters:
- commentary: The content of the post
- visibility: Post visibility setting ("PUBLIC" or "CONNECTIONS")
- LinkedIn access tokens expire after 60 days
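To make the `commentary` and `visibility` parameters concrete, the sketch below validates them and assembles a request body. It is not the project's `linkedin_publisher.py`: the function name `build_post_payload` is hypothetical, and the payload fields are an assumption based on the general shape of LinkedIn's Posts API, so verify them against the current API documentation before relying on them.

```python
import os

ALLOWED_VISIBILITY = {"PUBLIC", "CONNECTIONS"}


def build_post_payload(commentary, visibility="PUBLIC", user_id=None):
    """Validate the tool parameters and return a post body as a dict.

    user_id falls back to the LINKEDIN_USER_ID environment variable.
    """
    user_id = user_id or os.environ.get("LINKEDIN_USER_ID")
    if not user_id:
        raise ValueError("LINKEDIN_USER_ID is not set")
    if not commentary or not commentary.strip():
        raise ValueError("commentary must not be empty")
    visibility = visibility.upper()
    if visibility not in ALLOWED_VISIBILITY:
        raise ValueError(f"visibility must be one of {sorted(ALLOWED_VISIBILITY)}")
    # Assumed field layout; confirm against the LinkedIn API version you target.
    return {
        "author": f"urn:li:person:{user_id}",
        "commentary": commentary,
        "visibility": visibility,
        "distribution": {"feedDistribution": "MAIN_FEED"},
        "lifecycleState": "PUBLISHED",
    }
```

Validating before sending keeps malformed posts from consuming an API call, and reading the user ID from the environment matches the `.env`-based configuration described above.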
├── stateful_agent/                        # Main package directory
│   ├── tools/                             # Tool implementations
│   │   ├── sqlite.py                      # Entity database operations
│   │   ├── chromadb.py                    # Vector database operations
│   │   ├── paper_crawler.py               # Paper collection and recommendation tools
│   │   └── linkedin_publisher.py          # LinkedIn posting automation
│   ├── agent.py                           # Core agent implementation
│   ├── paper_recommendation_agent.py      # Specialized paper recommendation agent
│   ├── data/                              # Data storage directory
│   │   ├── <lab_name>/                    # Lab-specific paper PDFs
│   │   └── recommendation/                # Recommended paper PDFs
│   ├── .env                               # Environment configuration
│   └── .secrets.toml                      # Secret configuration (not tracked)
└── frontend/                              # Frontend implementation
- Python 3.11 or higher
- OpenAI API key (used for gpt-4o paper summarization and for the embedding model)
- Internet connection for accessing Google Scholar and arXiv
- (Optional) GitHub, Slack, or Google credentials for additional features
- Clone the repository:
git clone https://github.com/nsd9696/stateful-agent.git
cd stateful-agent
- Create and activate a virtual environment:
cd stateful_agent
pip install uv
uv pip install -e ".[dev]"
- Create the necessary configuration files in the stateful_agent directory. For .env:
OPENAI_EMBEDDING_MODEL=text-embedding-3-large
OPENAI_API_KEY=YOUR_OPENAI_KEY
CHROMA_PERSIST_DIRECTORY=./chroma_langchain_db
SQLITE_DB_PATH=./sqlite_langchain_db.db
DEFAULT_DATA_DIR=./data
- Prepare the environment (skip the cd if you are already in stateful_agent):
cd stateful_agent
mkdir -p data/recommendation
- Run the agent in terminal mode:
uv run stateful-agent deploy-agent --file agent.py --mode terminal
- Run the agent in web mode:
uv run stateful-agent deploy-agent --file agent.py --mode web
- Example interactions:
# Create a new research lab
> Create a lab called vision_research_lab at University of California, Berkeley, with leader Jitendra Malik
# Add members with their Google Scholar profiles
> Add member Haozhi Qi with scholar URL https://scholar.google.com/citations?user=iyVHKkcAAAAJ&hl=en to vision_research_lab
# Add research areas for the lab
> Add computer vision, machine learning and robotics as research areas for vision_research_lab
# Add website and description for the lab
> Add https://people.eecs.berkeley.edu/~malik/ as the website for vision_research_lab, and add description for the lab: Vision Intelligence
# Crawl Google Scholar for papers by lab members
> Collect papers from vision_research_lab members
# Stay updated with the lab
> Check new papers for vision_research_lab
# Get paper recommendations
> Recommend 5 papers from the last 30 days related to vision_research_lab
# Generate a paper summary
> Summarize the latest paper by Haozhi Qi from vision_research_lab
> Summarize the latest paper from vision_research_lab
# Delete the lab
> Delete the lab vision_research_lab
- Run tests:
pytest
- Format code:
black . && isort .
- Type checking:
mypy .
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- LangChain for the core agent capabilities
- ChromaDB for vector storage
- OpenAI for embedding and completion APIs
- arXiv for access to research papers
- Zotero-arXiv-Daily for inspiration on paper recommendation