Usage Guide

Practical examples and workflows for using Parliament Scraper to extract and analyze Bulgarian parliamentary data.

🚀 Quick Start Workflow

Complete Data Extraction Pipeline

# 1. Extract basic parliamentary structure
php artisan parliament:scrape
php artisan committees:scrape

# 2. Extract legislative content
php artisan bills:scrape --all-committees --detailed

# 3. Extract meeting content
php artisan transcripts:scrape --all --year=2024

# 4. Transcribe video content (requires an ElevenLabs API key)
php artisan videos:transcribe-v2 --committee=3613 --since=2024-01-01

# 5. Analyze content with AI
php artisan analyze:transcripts --since=2024-01-01
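Because each stage of the pipeline above depends on the previous one, it can help to stop at the first failure instead of running later stages on incomplete data. The following is a minimal sketch of that idea. `run_steps` is a hypothetical helper, not part of Parliament Scraper, and it assumes the artisan commands exit with a non-zero status when they fail.

```shell
# run_steps CMD...: run each argument as one full command line, in order,
# and stop at the first one that exits non-zero.
# (Hypothetical helper; assumes failing artisan commands return non-zero.)
run_steps() {
  local step
  for step in "$@"; do
    echo ">>> $step"
    # bash -c lets each step carry its own flags and quoting.
    if ! bash -c "$step"; then
      echo "Step failed: $step" >&2
      return 1
    fi
  done
}

# Example call, using commands from the pipeline above:
# run_steps \
#   "php artisan parliament:scrape" \
#   "php artisan committees:scrape" \
#   "php artisan bills:scrape --all-committees --detailed"
```

Passing each step as a single quoted string keeps the helper small; the trade-off is that arguments containing quotes need extra escaping.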

👥 Parliament Members

Basic Member Extraction

# Extract all parliament members
php artisan parliament:scrape

What this does:

  • Fetches all current parliament members
  • Extracts names, electoral districts, political parties
  • Downloads detailed profiles including professions and email addresses
  • Stores relationships between members and their details

Export Members Data

# Export to CSV with Bulgarian text support
php artisan parliament:export-csv

Output: storage/app/parliament_members.csv with UTF-8 BOM encoding for Excel compatibility.
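To confirm that an exported file really starts with the UTF-8 byte order mark (the bytes EF BB BF), the first three bytes can be inspected from the shell. This is a small sketch; `has_utf8_bom` is a hypothetical helper name, and it relies only on the standard `head`, `od`, and `tr` utilities.

```shell
# has_utf8_bom FILE: succeed if FILE begins with the UTF-8 BOM (EF BB BF).
# (Hypothetical helper for checking exported CSV files.)
has_utf8_bom() {
  local first_bytes
  # Dump the first three bytes as hex and strip spaces and newlines.
  first_bytes="$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')"
  [ "$first_bytes" = "efbbbf" ]
}

# Example call:
# has_utf8_bom storage/app/parliament_members.csv && echo "BOM present"
```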


🏢 Committees

Committee Data Extraction

# Extract all committees and their members
php artisan committees:scrape

What this does:

  • Fetches all parliamentary committees
  • Extracts committee details (names, contact info, rules)
  • Maps members to committees with their positions
  • Tracks leadership roles (chairpersons, deputy chairs)

Committee-Specific Operations

# Export committee structure
php artisan committees:export-csv

# Generate individual committee files
php artisan committees:export-files --format=txt

# Custom output directory
php artisan committees:export-files --folder=my_committees

📄 Legislative Bills

Bill Scraping Options

# Scrape bills for specific committee
php artisan bills:scrape --committee-id=3613

# Scrape bills for all committees (comprehensive)
php artisan bills:scrape --all-committees

# Include PDF text extraction and detailed analysis
php artisan bills:scrape --all-committees --detailed

# Only download PDFs for existing bills
php artisan bills:scrape --pdf-only

Bill Monitoring

# Check for new bills in last 7 days
php artisan bills:check-new

# Check for new bills in last 30 days
php artisan bills:check-new --days=30

# Monitor specific committee with notifications
php artisan bills:check-new --committee-id=3613 --notify

Bill Data Export

# Export all bills
php artisan bills:export-csv

# Export recent bills only
php artisan bills:export-csv --days=30

# Export bills for specific committee
php artisan bills:export-csv --committee-id=3613

📜 Meeting Transcripts

Transcript Extraction

# Scrape transcripts for specific committee and month
php artisan transcripts:scrape --committee=3613 --year=2024 --month=6

# Scrape transcripts for entire year (all 12 months)
php artisan transcripts:scrape --committee=3613 --year=2024

# Scrape transcripts for all committees (current month)
php artisan transcripts:scrape --all

Interactive Transcript Management

# Interactive committee selection with table view
php artisan transcripts:list

# List transcripts for specific committee
php artisan transcripts:list --committee=3613

# Show only downloaded transcripts
php artisan transcripts:list --downloaded

# Export transcript list to CSV
php artisan transcripts:list --export

Transcript Monitoring

# Check for new transcripts (all committees, current month)
php artisan transcripts:check-new

# Check specific committee with automatic analysis
php artisan transcripts:check-new --committee=3613 --analyze

# Send notifications for new transcripts
php artisan transcripts:check-new --notify

🎥 Video Transcription

Basic Video Transcription

# Transcribe videos for specific committee (recommended approach)
php artisan videos:transcribe-v2 --committee=3613 --since=2025-01-01

# Transcribe videos for multiple committees
php artisan videos:transcribe-v2 --committee=3613 --committee=3595 --since=2024-01-01

# Transcribe all committees (resource intensive)
php artisan videos:transcribe-v2 --all --since=2024-01-01

Targeted Video Processing

# Transcribe specific meeting
php artisan videos:transcribe-v2 --meeting=13565

# Transcribe for specific time period with limit
php artisan videos:transcribe-v2 --committee=3613 --since=2024-06-01 --limit=20

# Overwrite existing transcriptions
php artisan videos:transcribe-v2 --committee=3613 --overwrite

Video Transcription Options

# Use specific ElevenLabs model
php artisan videos:transcribe-v2 --committee=3613 --model=eleven_multilingual_v2

# Dry run to see what would be processed
php artisan videos:transcribe-v2 --committee=3613 --dry-run

# Process with no PHP memory limit (memory_limit=-1; for large videos)
php -d memory_limit=-1 artisan videos:transcribe-v2 --committee=3613

Legacy Video Processing (from downloaded files)

# Download videos first (if needed)
php artisan meetings:download-videos --committee=3613 --year=2024

# Transcribe from downloaded files
php artisan videos:transcribe --use-files --directory=/path/to/videos

🤖 AI Analysis

Transcript Analysis

# Analyze recent unanalyzed transcripts (default: 10)
php artisan analyze:transcripts

# Analyze all transcripts
php artisan analyze:transcripts --all

# Analyze specific transcript IDs
php artisan analyze:transcripts --ids=1 --ids=2 --ids=3

# Analyze transcripts from specific date
php artisan analyze:transcripts --since=2024-01-01

# Analyze transcripts from specific committee
php artisan analyze:transcripts --committee=3613

# Re-analyze already analyzed transcripts
php artisan analyze:transcripts --reanalyze

# Dry run to see what would be analyzed
php artisan analyze:transcripts --dry-run

Protocol Extraction

# Extract from specific transcripts
php artisan transcripts:extract --transcript=1 --transcript=2

# Extract from committee transcripts
php artisan transcripts:extract --committee=3613

# Extract from date range
php artisan transcripts:extract --from=2024-01-01 --to=2024-12-31

# Extract specific type of information
php artisan transcripts:extract --type=bill_discussions

# Force re-processing of already extracted transcripts
php artisan transcripts:extract --force

📊 Data Export & Analysis

Advanced Transcript Export

# Interactive committee selection for export
php artisan transcripts:export-committee

# Export specific committees in different formats
php artisan transcripts:export-committee --committee=3613 --format=html

# Export with date range and metadata
php artisan transcripts:export-committee --from=2024-01-01 --to=2024-12-31 --include-metadata

# Create separate files per transcript
php artisan transcripts:export-committee --separate-files --include-analysis

Export for External Analysis

# Export transcripts for analysis
php artisan transcripts:export-analysis --committee=3613

# Export in JSON format with AI analysis
php artisan transcripts:export-analysis --format=json --include-analysis

# Custom output directory
php artisan transcripts:export-analysis --output=research_data

🔄 Automated Workflows

Daily Monitoring Setup

Create a shell script for daily parliamentary monitoring:

#!/bin/bash
# daily_monitor.sh

echo "Starting daily parliamentary monitoring..."

# Check for new bills
php artisan bills:check-new --days=1

# Check for new transcripts
php artisan transcripts:check-new

# Analyze any new transcripts
# Note: "date -d" is GNU date syntax; on macOS/BSD use: date -v-1d +%Y-%m-%d
php artisan analyze:transcripts --since="$(date -d '1 day ago' +%Y-%m-%d)"

echo "Daily monitoring complete."
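The `date -d '1 day ago'` call in the script above is GNU-specific and fails with the BSD `date` shipped on macOS. If the script has to run on both, a small wrapper can try one syntax and fall back to the other. This is a sketch; `yesterday` is a hypothetical helper name.

```shell
# yesterday: print yesterday's date as YYYY-MM-DD.
# Tries GNU date syntax first, then falls back to BSD/macOS syntax.
yesterday() {
  date -d '1 day ago' +%Y-%m-%d 2>/dev/null || date -v-1d +%Y-%m-%d
}

# Example call:
# php artisan analyze:transcripts --since="$(yesterday)"
```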

Weekly Data Refresh

#!/bin/bash
# weekly_refresh.sh

echo "Starting weekly data refresh..."

# Update parliament members (in case of changes)
php artisan parliament:scrape

# Update committee memberships
php artisan committees:scrape

# Check for new bills in last week
php artisan bills:check-new --days=7

# Export updated data
php artisan parliament:export-csv
php artisan committees:export-csv
php artisan bills:export-csv --days=7

echo "Weekly refresh complete."

📋 Common Use Cases

Research Workflow

# 1. Set up base data
php artisan parliament:scrape
php artisan committees:scrape

# 2. Focus on specific research period
php artisan transcripts:scrape --committee=3613 --year=2024
php artisan bills:scrape --committee-id=3613 --detailed

# 3. Analyze content
php artisan analyze:transcripts --committee=3613
php artisan transcripts:extract --committee=3613

# 4. Export for analysis
php artisan transcripts:export-analysis --committee=3613 --include-analysis

Journalism Workflow

# 1. Monitor recent activity
php artisan bills:check-new --days=30
php artisan transcripts:check-new

# 2. Deep dive on specific topics
php artisan videos:transcribe-v2 --committee=3613 --since=2024-01-01
php artisan analyze:transcripts --committee=3613 --since=2024-01-01

# 3. Export findings
php artisan transcripts:export-committee --committee=3613 --format=html --include-analysis

Academic Research Workflow

# 1. Comprehensive data collection
php artisan parliament:scrape
php artisan committees:scrape
php artisan bills:scrape --all-committees --detailed

# 2. Multi-year transcript analysis
for year in 2022 2023 2024; do
  php artisan transcripts:scrape --all --year="$year"
done

# 3. AI analysis across time periods
php artisan analyze:transcripts --all
php artisan transcripts:extract --from=2022-01-01 --to=2024-12-31

# 4. Structured data export
php artisan transcripts:export-analysis --format=json --include-analysis

⚠️ Performance Tips

Memory Management

# For large operations, remove the PHP memory limit (memory_limit=-1 means unlimited)
php -d memory_limit=-1 artisan videos:transcribe-v2 --all

# Process in smaller batches
php artisan transcripts:scrape --committee=3613 --year=2024 --month=1
php artisan transcripts:scrape --committee=3613 --year=2024 --month=2
# ... continue for each month
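Rather than typing out twelve commands, the month-by-month batches above can be driven by a loop. The sketch below wraps that loop in a function. `scrape_year_by_month` and the `DRY_RUN` environment variable are hypothetical conveniences of this example, not features of Parliament Scraper; only the `--committee`, `--year`, and `--month` options come from the commands shown above.

```shell
# scrape_year_by_month COMMITTEE YEAR: run one transcripts:scrape per month.
# Set DRY_RUN=1 (hypothetical, specific to this helper) to print the
# commands instead of executing them.
scrape_year_by_month() {
  local committee="$1" year="$2" month
  for month in 1 2 3 4 5 6 7 8 9 10 11 12; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "php artisan transcripts:scrape --committee=$committee --year=$year --month=$month"
    else
      # Stop at the first month that fails so it can be retried.
      php artisan transcripts:scrape --committee="$committee" --year="$year" --month="$month" || return 1
    fi
  done
}

# Example calls:
# DRY_RUN=1 scrape_year_by_month 3613 2024   # preview the 12 commands
# scrape_year_by_month 3613 2024             # run them
```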

API Rate Limiting

# Add delays between operations for large datasets
php artisan bills:scrape --all-committees --detailed
sleep 60
php artisan transcripts:scrape --all --year=2024
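When more than two operations are chained, the same pattern of pausing between commands can be factored into a helper. This is a sketch only; `run_with_pause` is a hypothetical name, and a suitable pause length depends on limits that this guide does not specify.

```shell
# run_with_pause SECONDS CMD...: run each command string in order,
# sleeping SECONDS between consecutive commands (not before the first).
# (Hypothetical helper; stops at the first command that fails.)
run_with_pause() {
  local pause="$1" first=1 cmd
  shift
  for cmd in "$@"; do
    if [ "$first" -eq 0 ]; then
      sleep "$pause"
    fi
    first=0
    bash -c "$cmd" || return 1
  done
}

# Example call, using the commands above:
# run_with_pause 60 \
#   "php artisan bills:scrape --all-committees --detailed" \
#   "php artisan transcripts:scrape --all --year=2024"
```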

Efficient Processing

# Use dry-run first to estimate scope
php artisan videos:transcribe-v2 --committee=3613 --dry-run

# Then process with appropriate limits
php artisan videos:transcribe-v2 --committee=3613 --limit=50

🔍 Troubleshooting Common Issues

API Connection Issues

# Test basic connectivity
curl -I "https://www.parliament.bg/api/v1/coll-list-ns/bg"

# Smoke-test the scraper against a single committee
php artisan committees:scrape --limit=1

Memory Issues

# Check the configured PHP memory limit
php -i | grep memory_limit

# Run with unlimited memory for large operations
php -d memory_limit=-1 artisan your:command

ElevenLabs API Issues

# Verify API key is set
php artisan tinker
# In tinker: config('services.elevenlabs.api_key')

# Test against a single meeting first (dry run, nothing is transcribed)
php artisan videos:transcribe-v2 --meeting=13565 --dry-run

Database Issues

# Check database connection
php artisan migrate:status

# Reset and rebuild if needed (WARNING: migrate:fresh drops all tables and their data)
php artisan migrate:fresh
php artisan parliament:scrape