🦉 Beifong: Your Junk-Free, Personalized Information and Podcasts

Beifong manages your trusted articles and social media platform sources. It generates podcasts from the content you trust and curate. It handles the complete pipeline, from data collection and analysis to the production of scripts and visuals.

▶️ Watch demo video HD

▶️ Watch the demo on YouTube

🔗 Blog

Getting Started
How to Use Beifong
- Three Usage Methods
Content Processing System
- Built-in Content Processors
- Creating Custom Content Processors
AI Agent and Tools
Web Search and Browser Automation
Social Media Monitoring
Audio and Voice Generation
- Supported TTS Engines
- Adding New Voice Engines
Integrations
Data Storage and File Management
Deployment and Access Options
Cloud Options
- Beifong Cloud Features
Troubleshooting
Updates

Getting Started

System Requirements

Before installing Beifong, ensure you have:

Python 3.11+
Redis Server
OpenAI API key
(Optional) ElevenLabs API key

Initial Setup and Installation

# Clone the repository
git clone https://github.com/arun477/beifong.git
cd beifong

# Move into the inner beifong/ app directory and create a virtual environment
cd beifong
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Install browser
python -m playwright install

# (Optional but recommended) Download demo content
# Navigate to the beifong directory if not already there
cd beifong  # Skip if already in the beifong folder
# This populates the system with sample data, curated source feeds, and assets
python bootstrap_demo.py

Environment Configuration

Create a .env file in the /beifong directory with your API keys:

OPENAI_API_KEY=your_openai_api_key
ELEVENSLAB_API_KEY=your_elevenlabs_api_key  # Optional
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_DB=0
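
Before launching the services, it can help to confirm that the .env file defines the keys listed above. The following is a minimal sketch using only the standard library. The helper names parse_env and missing_keys are hypothetical and not part of Beifong, which may load the file differently.

```python
REQUIRED_KEYS = ("OPENAI_API_KEY", "REDIS_HOST", "REDIS_PORT", "REDIS_DB")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and comment lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing " # comment" such as the "# Optional" marker above.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


sample = """OPENAI_API_KEY=your_openai_api_key
ELEVENSLAB_API_KEY=your_elevenlabs_api_key  # Optional
REDIS_HOST=localhost
REDIS_PORT=6379
"""
env = parse_env(sample)
print(missing_keys(env))  # → ['REDIS_DB']
```

To check a real file, pass its contents instead of the sample, for example parse_env(open(".env").read()).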

Starting the Application

Launch the required services in separate terminals. Start python main.py first and let it finish starting up before launching the others, because the first run initializes the database:

⚠️ Make sure to activate the virtual environment in all terminals before starting each script.

source venv/bin/activate

# Terminal 1: Start the main backend (the first run may take 2 to 3 minutes while setup completes)
cd beifong
python main.py

# Terminal 2: Start the scheduler
cd beifong
python -m scheduler

# Terminal 3: Start the chat workers
cd beifong
python -m celery_worker

# Verify Redis is running (should print PONG)
redis-cli ping
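
If redis-cli is not available, the same check can be sketched in Python with a raw socket. This is an illustration only: redis_is_up is a hypothetical helper, it assumes the REDIS_HOST and REDIS_PORT values from the .env example, and it reports False for a server that requires a password.

```python
import socket


def redis_is_up(host: str = "localhost", port: int = 6379, timeout: float = 2.0) -> bool:
    """Send an inline PING and report whether the server answers +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            return conn.recv(16).startswith(b"+PONG")
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling redis_is_up() before starting the scheduler and chat workers gives an early, readable failure instead of a connection error later.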

Optional: Frontend Development Mode

# Navigate to web directory
cd web

# Install dependencies
npm install

# Start development server
npm start

How to Use Beifong

Three Usage Methods

Beifong offers flexibility in how you interact with the system:

Interactive Web UI - Web interface for content management and podcast generation
API Integration - Programmatic access for custom applications and workflows
Automated Scheduling - Set up recurring tasks for hands-off content processing

Content Processing System

Built-in Content Processors

Beifong includes several specialized processors for different content sources:

RSS Feed Processor - Monitors RSS feeds for new articles and content
URL Content Processor - Extracts and processes content from web pages
AI Content Analyzer - Categorizes, summarizes, and analyzes content quality
Vector Embedding Processor - Creates searchable vector representations of content
FAISS Search Indexer - Builds search indices for content discovery
Podcast Script Generator - Creates complete podcast episodes from curated content
X.com Social Processor - Crawls and processes your X.com social media feed
Facebook Social Processor - Crawls and processes your Facebook social media feed

Creating Custom Content Processors

Extend Beifong's capabilities by adding your own content processors:

Step 1: Create Your Processor Module

# processors/my_custom_processor.py
def process_custom_task(parameter1=None, parameter2=None):
    # Your processing logic here
    stats = {"processed": 0, "success": 0, "errors": 0}
    # Processing implementation
    return stats

if __name__ == "__main__":
    stats = process_custom_task()
    print(f"Processed: {stats['processed']}, Success: {stats['success']}")
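
To make the stats dictionary concrete, here is a minimal sketch of a processor body that counts successes and errors per item. The items argument and the handle_item helper are hypothetical placeholders for your own logic; only the stats keys come from the template above.

```python
def handle_item(item: str) -> str:
    """Hypothetical per-item work: reject empty input, otherwise normalize it."""
    if not item.strip():
        raise ValueError("empty item")
    return item.strip().lower()


def process_custom_task(items=None):
    """Process each item and return the same stats shape as the template."""
    stats = {"processed": 0, "success": 0, "errors": 0}
    for item in items or []:
        stats["processed"] += 1
        try:
            handle_item(item)
            stats["success"] += 1
        except ValueError:
            # One bad item should not abort the whole run.
            stats["errors"] += 1
    return stats


stats = process_custom_task(["First Article", "  ", "Second Article"])
print(f"Processed: {stats['processed']}, Success: {stats['success']}, Errors: {stats['errors']}")
# → Processed: 3, Success: 2, Errors: 1
```

Catching errors per item keeps a single malformed entry from hiding the results of the rest of the batch.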

Step 2: Register Your Processor

Add your processor to the system in models/tasks_schemas.py:

class TaskType(str, Enum):
    # Existing task types...
    my_custom_processor = "my_custom_processor"

TASK_TYPES = {
    # Existing types...
    "my_custom_processor": {
        "name": "My Custom Processor",
        "command": "python -m processors.my_custom_processor",
        "description": "Performs custom processing task",
    },
}
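
A quick consistency check can catch a task type that was added to the enum but not to the dictionary, or an entry with a missing field. This is a self-contained sketch that redefines TaskType and TASK_TYPES with only the example entry; in practice you would import them from models/tasks_schemas.py. The two helper functions are hypothetical and not part of Beifong.

```python
from enum import Enum


class TaskType(str, Enum):
    my_custom_processor = "my_custom_processor"


TASK_TYPES = {
    "my_custom_processor": {
        "name": "My Custom Processor",
        "command": "python -m processors.my_custom_processor",
        "description": "Performs custom processing task",
    },
}


def unregistered_task_types() -> list[str]:
    """Return enum values that have no matching TASK_TYPES entry."""
    return [t.value for t in TaskType if t.value not in TASK_TYPES]


def incomplete_entries() -> list[str]:
    """Return TASK_TYPES keys that lack name, command, or description."""
    required = {"name", "command", "description"}
    return [key for key, meta in TASK_TYPES.items() if not required.issubset(meta)]


print(unregistered_task_types(), incomplete_entries())  # → [] []
```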

Step 3: Deploy Your Processor

Create a new task using the API or UI with your custom processor type.

AI Agent and Tools

Agent Architecture Overview

Beifong's AI system is built on the agno framework and includes:

Search Tools - Semantic search, keyword search, and browser-based web research
Content Generation Tools - Automated script writing, banner creation, and audio production
Persistent Session State - Maintains conversation context across interactions
Tool Orchestration - Manages multi-step workflows automatically

Adding Custom Tools

Extend the agent's capabilities with custom tools:

# tools/my_custom_tool.py
from agno.agent import Agent

def my_custom_tool(agent: Agent, param1: str, param2: str) -> str:
    """Tool description here"""
    agent.session_state["my_key"] = "my_value"
    # Tool implementation
    result = f"Processed {param1} and {param2}"
    return result

Register your tool in services/celery_tasks.py:

# Add import
from tools.my_custom_tool import my_custom_tool
# Add to tools list
tools = [my_custom_tool]
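
Because a tool is a plain function that reads and writes agent.session_state, its logic can be exercised without starting the full agent. The sketch below uses FakeAgent, a hypothetical stand-in that only mimics the session_state attribute; it is not the agno Agent class, and the my_tool_calls key is an arbitrary example.

```python
class FakeAgent:
    """Stand-in for agno.agent.Agent that only carries session_state."""

    def __init__(self):
        self.session_state = {}


def my_custom_tool(agent, param1: str, param2: str) -> str:
    """Record the call in session state and return a short summary."""
    history = agent.session_state.setdefault("my_tool_calls", [])
    history.append((param1, param2))
    return f"Processed {param1} and {param2} (call #{len(history)})"


agent = FakeAgent()
print(my_custom_tool(agent, "a", "b"))  # → Processed a and b (call #1)
print(my_custom_tool(agent, "c", "d"))  # → Processed c and d (call #2)
```

State written on the first call is visible on the second, which is the behavior a tool relies on when it runs inside a persistent session.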

Configuring Agent Behavior

Modify the agent's instructions and behavior in db/agent_config_v2.py:

# Update the instructions to modify the agent's behavior
# Be careful to preserve the core flow stages while adding your customizations

Web Search and Browser Automation

Beifong's search agent has full browser automation capabilities through the browser-use library, enabling web research and automated data collection from websites.

Search Commands

You can give the agent specific search instructions like:

"Go to my X.com and collect top positive and informative feeds"
"Browse Reddit for discussions about AI developments this week"
"Search LinkedIn for recent posts about data science trends"
"Visit news sites and gather articles about renewable energy"

The agent will navigate websites, interact with page elements, and extract the requested information automatically.

Social Media Login Sessions

For websites requiring authentication (X.com, Facebook, LinkedIn, etc.), you need to establish logged-in sessions:

Setting Up Social Media Sessions:

Navigate to Social Tab in the Beifong web interface
Click "Setup Session" under the Setup section
Login Process - A browser window will open where you:
- Log into your social media accounts normally
- Complete any verification steps
- Close the browser when finished
Session Persistence - Beifong will use these authenticated sessions for future automated searches

Advanced Persistent Session Configuration

For persistent logged-in sessions and advanced browser management:

Persistent Session Path Configuration:

Default browser sessions are stored in the browsers/playwright_persistent_profile_web folder
For persistent session paths, modify tools/web_search to use get_browser_session_path() from db/config.py

Important Persistent Session Management Notes:

Avoid Concurrent Usage - Ensure no other processes use the same browser session simultaneously
Social Monitor Processors typically use the path from get_browser_session_path() function
Disable Conflicting Processes - Switch off social monitoring in the Voyager section if using persistent session paths
Future Separation - Session management will be separated into individual sessions in upcoming updates

Persistent Session Troubleshooting:

If login sessions expire, repeat the Social Tab setup process
Clear browser data if experiencing authentication issues
Ensure only one process accesses browser sessions at a time
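
One way to enforce the one-process-at-a-time rule in your own scripts is a lock file next to the browser profile. The following is a minimal sketch, not something Beifong does itself: browser_session_lock, SessionBusy, and the .session.lock file name are all hypothetical, and a crashed process can leave a stale lock that must be removed by hand.

```python
import os
from contextlib import contextmanager


class SessionBusy(RuntimeError):
    """Raised when another process already holds the profile lock."""


@contextmanager
def browser_session_lock(profile_dir: str):
    """Hold an exclusive lock file in profile_dir for the duration of the block."""
    lock_path = os.path.join(profile_dir, ".session.lock")
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise SessionBusy(f"{profile_dir} is in use by another process") from None
    try:
        os.write(fd, str(os.getpid()).encode())
        yield lock_path
    finally:
        os.close(fd)
        os.remove(lock_path)
```

Wrapping any code that opens the persistent profile in "with browser_session_lock(path):" makes a second concurrent run fail fast instead of corrupting the session.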

Social Media Monitoring

Supported Platforms

Beifong currently supports automated monitoring for:

X.com (Twitter) - Collects and analyzes your social media feeds
Facebook.com - Monitors your Facebook timeline and interactions

Setting Up Scheduled Feed Collection

To automatically collect your social media feeds:

Navigate to the Voyager Tab in the Beifong web interface
Create a Scheduled Task for social media monitoring
Configure Collection Frequency - Set how often you want feeds collected
Select Platform - Choose between X.com or Facebook.com processors

Viewing AI Insights

Once your social media feeds are collected:

Navigate to the Social Tab in the web interface
View Comprehensive Analysis - Each post is analyzed by AI, which provides:
- Content sentiment analysis
- Topic categorization
- Engagement insights
- Relevance scoring
Browse Full Insights - Detailed analytics for all collected social media content

Configuring Custom Feeds

You can easily customize which feeds to monitor:

Modifying Feed Sources:

Navigate to /tools/social/ directory
Update the URLs in the social media processors
Monitor Specific Profiles - Configure to track particular X.com profiles or Facebook pages
Custom Feed Types - Adapt URLs for different types of content feeds

URL Configuration Examples:

Track specific X.com user: Modify URLs to target particular profiles
Monitor Facebook pages: Configure URLs for specific Facebook feeds
Custom hashtag monitoring: Set URLs to track specific hashtags or topics
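
As an illustration of what such URLs might look like, the sketch below builds profile, hashtag, and page URLs from user input. The helper names are hypothetical, and how the processors in /tools/social/ store their URLs is not specified here, so treat this only as a way to produce candidate values to paste in.

```python
from urllib.parse import quote


def x_profile_url(username: str) -> str:
    """Build a URL for a single X.com profile, accepting an optional leading @."""
    return f"https://x.com/{quote(username.lstrip('@'))}"


def x_hashtag_url(tag: str) -> str:
    """Build a URL for an X.com hashtag, accepting an optional leading #."""
    return f"https://x.com/hashtag/{quote(tag.lstrip('#'))}"


def facebook_page_url(page: str) -> str:
    """Build a URL for a Facebook page by its page name."""
    return f"https://www.facebook.com/{quote(page)}"


print(x_profile_url("@example_user"))  # → https://x.com/example_user
print(x_hashtag_url("#AI"))            # → https://x.com/hashtag/AI
print(facebook_page_url("examplepage"))  # → https://www.facebook.com/examplepage
```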

Adding New Social Media Accounts

Beifong supports easy expansion to additional platforms:

Currently Supported:

X.com (Twitter)
Facebook.com

Easy Integration Options:

LinkedIn
Reddit
Other Platforms - Most social media platforms can be integrated using the same framework, but you must write a custom scraper or use an API for it.

Future Updates:

The next version will include more built-in connectors for popular social media platforms
Support for multiple account management per platform

Scheduling Best Practices

Important Scheduling Considerations:

⚠️ Avoid Concurrent Execution - When scheduling multiple social media feed collection tasks, ensure they don't run simultaneously. All social media processors share the same persistent browser session.

Recommended Scheduling Approach:

Stagger Collection Times - Schedule X.com and Facebook.com collection at different times
Allow Processing Gaps - Leave sufficient time between different social media tasks
Monitor Execution Times - Track how long each collection takes to avoid overlaps

Example Safe Scheduling:

X.com feed collection: Every 2 hours at :00 minutes
Facebook.com feed collection: Every 2 hours at :30 minutes
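
Whether a staggered schedule is actually safe depends on how long each collection runs. The sketch below checks two recurring schedules for overlap; the 20-minute and 45-minute run durations are assumptions for illustration, so substitute the execution times you observe. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def run_windows(start: datetime, every: timedelta, offset: timedelta,
                duration: timedelta, count: int):
    """Return (begin, end) windows for a task repeating at a fixed interval."""
    return [(start + offset + i * every, start + offset + i * every + duration)
            for i in range(count)]


def overlaps(a, b) -> bool:
    """True if any window in a intersects any window in b."""
    return any(s1 < e2 and s2 < e1 for s1, e1 in a for s2, e2 in b)


day = datetime(2025, 1, 1)
two_hours = timedelta(hours=2)

# X.com at :00 and Facebook.com at :30, each assumed to run for 20 minutes.
x_runs = run_windows(day, two_hours, timedelta(minutes=0), timedelta(minutes=20), 12)
fb_runs = run_windows(day, two_hours, timedelta(minutes=30), timedelta(minutes=20), 12)
print(overlaps(x_runs, fb_runs))  # → False

# The same schedule collides if a run takes 45 minutes.
slow_x = run_windows(day, two_hours, timedelta(minutes=0), timedelta(minutes=45), 12)
print(overlaps(slow_x, fb_runs))  # → True
```

The second result shows why tracking execution times matters: a 30-minute stagger only protects runs that finish in under 30 minutes.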

Future Improvements:

The next version will provide separate persistent browser sessions for each social media account
This will eliminate the need for careful scheduling and allow concurrent collection from multiple platforms

Audio and Voice Generation

Supported TTS Engines

Beifong supports multiple text-to-speech options:

Commercial Options:

OpenAI TTS
ElevenLabs

Open Source Options:

Kokoro

Adding New Voice Engines

The TTS system supports integration of additional engines:

Potential Next Open Source Integration Options:

Dia TTS
CSM
Orpheus-TTS

Add custom TTS engines through the tts_selector engine interface in the utils directory.
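
The actual tts_selector interface is not documented in this README, so check the code in the utils directory before writing an engine. As a generic illustration of the selector pattern only, here is a hypothetical registry in which each engine is a function from text and voice to audio bytes. None of these names come from Beifong.

```python
from typing import Callable

# Hypothetical registry: engine name -> function(text, voice) -> audio bytes.
TTS_ENGINES: dict[str, Callable[[str, str], bytes]] = {}


def register_engine(name: str):
    """Decorator that adds an engine function to the registry under `name`."""
    def decorator(func):
        TTS_ENGINES[name] = func
        return func
    return decorator


@register_engine("dummy")
def dummy_tts(text: str, voice: str) -> bytes:
    """Placeholder engine that returns tagged bytes instead of real audio."""
    return f"[{voice}] {text}".encode("utf-8")


def synthesize(engine: str, text: str, voice: str = "default") -> bytes:
    """Look up the requested engine and run it, failing clearly if unknown."""
    if engine not in TTS_ENGINES:
        raise ValueError(f"Unknown TTS engine: {engine!r}")
    return TTS_ENGINES[engine](text, voice)


print(synthesize("dummy", "Hello"))  # → b'[default] Hello'
```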

Integrations

Beifong can be integrated with other platforms.

Slack Integration

Beifong's Slack integration enables you to interact with the AI agent directly from your Slack workspace. Each conversation with Beifong creates a dedicated Slack thread for the session.

Key Feature:

Direct messaging with BeifongAI in Slack channels

Setting Up Slack App

To integrate Beifong with your Slack workspace, you need to create a Slack app in Socket Mode:

Step 1: Create Slack App

Visit Slack API Apps and click "Create New App"
Choose "From scratch" and provide:
- App Name: BeifongAI (or your preferred name)
- Workspace: Select your target Slack workspace
Enable Socket Mode:
- Navigate to "Socket Mode" in the left sidebar
- Toggle "Enable Socket Mode" to ON
- Generate an App-Level Token with connections:write scope
- Save the App-Level Token (this is your SLACK_APP_TOKEN)

Step 2: Configure Bot User

Navigate to "OAuth & Permissions" in the left sidebar
Scroll to "Bot Token Scopes" and add the required permissions (see next section)
Click "Install to Workspace" and authorize the app
Copy the Bot User OAuth Token (this is your SLACK_BOT_TOKEN)

Step 3: Enable Event Subscriptions

Navigate to "Event Subscriptions" in the left sidebar
Toggle "Enable Events" to ON
Add the required bot events (see permissions section below)

Required Slack Permissions

Your Slack app requires specific permissions to function properly with Beifong:

OAuth & Permissions - Bot Token Scopes

Add the following scopes under "OAuth & Permissions" → "Bot Token Scopes":

app_mentions:read - View messages that directly mention @BeifongAI in conversations that the app is in
assistant:write - Allow BeifongAI to act as an App Agent
channels:history - View messages and other content in public channels that BeifongAI has been added to
channels:read - View basic information about public channels in a workspace
chat:write - Send messages as @BeifongAI
files:read - View files shared in channels and conversations that BeifongAI has been added to
files:write - Upload, edit, and delete files as @BeifongAI
im:read - View basic information about direct messages that BeifongAI has been added to
im:write - Start direct messages with people

Event Subscriptions - Bot Events

Under "Event Subscriptions" → "Subscribe to bot events", add:

app_mention - Subscribe to only the message events that mention your app or bot
- Required Scope: app_mentions:read
message.channels - A message was posted to a channel
- Required Scope: channels:history

Environment Configuration

Add your Slack tokens to the .env file in the /beifong directory:

# Existing environment variables...
OPENAI_API_KEY=your_openai_api_key
ELEVENSLAB_API_KEY=your_elevenlabs_api_key  # Optional

# Slack Integration
SLACK_BOT_TOKEN=xoxb-your-bot-user-oauth-token
SLACK_APP_TOKEN=xapp-your-app-level-token

# Redis configuration
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_DB=0
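
The two Slack tokens are easy to swap by mistake. The following sketch checks that each variable is set and carries the prefix shown in the example above (xoxb- for the bot token, xapp- for the app-level token). check_slack_tokens is a hypothetical helper, not part of Beifong.

```python
def check_slack_tokens(env: dict[str, str]) -> list[str]:
    """Return human-readable problems with the Slack token settings."""
    expected = {"SLACK_BOT_TOKEN": "xoxb-", "SLACK_APP_TOKEN": "xapp-"}
    problems = []
    for key, prefix in expected.items():
        value = env.get(key, "")
        if not value:
            problems.append(f"{key} is not set")
        elif not value.startswith(prefix):
            problems.append(f"{key} should start with {prefix}")
    return problems


swapped = {"SLACK_BOT_TOKEN": "xapp-abc", "SLACK_APP_TOKEN": "xoxb-def"}
print(check_slack_tokens(swapped))
# → ['SLACK_BOT_TOKEN should start with xoxb-', 'SLACK_APP_TOKEN should start with xapp-']
```

To check the running environment, pass dict(os.environ) after loading the .env file.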

Running Slack Integration

Once you've configured your Slack app and environment variables:

Step 1: Install App in Workspace

Ensure your Slack app is installed in your workspace
Add BeifongAI to the channels where you want to use it
You can also send direct messages to BeifongAI

Step 2: Start Slack Integration

# Navigate to beifong directory
cd beifong

# Ensure your environment is activated
source venv/bin/activate

# Run the Slack integration script
python -m integrations.slack.chat

Step 3: Interact with BeifongAI

In Slack Channels:

Mention @BeifongAI to start a conversation
Each mention creates a new thread for context continuity
Example: @BeifongAI Can you help me analyze the latest news about AI developments?

Reference Documentation:

Slack Socket Mode API

Data Storage and File Management

Database Storage

All application databases are organized in the databases directory for easy management and backup.

Media Asset Storage

Generated podcasts, audio files, and visual assets are stored in the podcasts directory.

Managing Storage Growth

If asset storage grows, consider these storage optimization strategies:

Cloud Storage Integration:

Use s3fs to mount an S3 bucket as a local folder for media assets
Configure custom storage paths in .env to use larger drives

Automated Cleanup:

Set up periodic archiving of older podcast episodes
Implement automated cleanup for temporary recordings and unused assets
Configure retention policies for different types of content

Storage Monitoring:

Monitor disk usage as your content library grows
Set up alerts for storage capacity thresholds
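
Disk usage can be checked from Python with the standard library. The sketch below reports usage for the volume that holds a given path and flags it when usage crosses a threshold. storage_report is a hypothetical helper and the 90% default is an arbitrary example.

```python
import shutil


def storage_report(path: str = ".", warn_at: float = 0.9) -> dict:
    """Summarize disk usage for the volume holding `path` and flag high usage."""
    usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total
    return {
        "total_gb": round(usage.total / 1e9, 1),
        "free_gb": round(usage.free / 1e9, 1),
        "used_fraction": round(used_fraction, 3),
        "warn": used_fraction >= warn_at,
    }


print(storage_report("."))
```

Running this from a scheduled task and alerting when "warn" is true covers the basic capacity-threshold case.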

Note: More efficient storage management and cloud connectors will be added in the next version.

Deployment and Access Options

Local Network Access

# Start the backend with network access
cd beifong
python main.py --host 0.0.0.0 --port 7000

This makes the application accessible via your machine's IP address on your local network.

Remote Access Solutions

For accessing Beifong from outside your local network (workaround):

SSH Port Forwarding

# Forward local port to remote machine
ssh -L 7000:localhost:7000 username@your-server-ip

Ngrok Tunneling

# Create temporary public tunnel
ngrok http 7000

Provides a temporary public URL that forwards to your local instance.

Security

Beifong doesn't include an authentication layer yet. Authentication will be added in the next version.

Cloud Options

Beifong Cloud Features

Coming Soon!

✅ Cloud version of Beifong

✅ More social media connectors

✅ More API options: Claude, Gemini, OpenAI, Ollama

✅ Podcast customization with more styles

✅ More voice options

✅ Better data collection and storage management

✅ Authentication layer

Troubleshooting

Kokoro Library Installation Issues

If installation fails because of the Kokoro library, you can skip it. Kokoro is optional and only needs to be installed if you want to use it as a text-to-speech engine.

For more information about Kokoro, check the reference: https://github.com/hexgrad/kokoro

Browseruse Installation Issues

If installation fails because of browseruse, make sure Playwright is installed correctly. The browser automation features depend on it.

For more reference and troubleshooting: https://github.com/browser-use/browser-use

FAISS Library Installation Issues

If the FAISS library installation fails, you can skip it. FAISS is only required for the semantic search feature, so the rest of Beifong works without it.

For reference: https://github.com/facebookresearch/faiss

Browser-Based Data Collection Issues

Some data collection features rely on browser automation, which may not work in server environments without a proper browser setup. Beifong will still run, but browser-dependent features may be unavailable.

Updates

🚀 Repo
