Dashboard
Active projects: 3 (1 paused)
Notebooks: 12 (across all stages)
URLs tracked: 0 (via Web URLs tab)
Tasks open: 4 (2 blocked)
Project progress
Project | Status
Samoylova Network | active
Immortal Regiment Spain | active
Propaganda Narratives 2026 | paused
Database status
File | Size | Schema | Last backup | Integrity
argos_deep.sqlite | 1.2 GB | v3 | 2026-06-21 | ok
argos_tracking.sqlite | - | v1 | - | pending
Samoylova Network (active)
01 ingest ✓
02 process ✓
03 organize →
04 analyze
Notebook | Version | Last run | Status
ARGOS_SamoyloveNetwork_01_TelegramIngest_v3 | v3 | 2026-06-20 | production
ARGOS_SamoyloveNetwork_01_WebScraping_v1 | v1 | 2026-06-18 | draft
ARGOS_SamoyloveNetwork_02_Transcription_v2 | v2 | 2026-06-20 | production
ARGOS_SamoyloveNetwork_03_NormalizationToDB_v2 | v2 | 2026-06-21 | running
Immortal Regiment Spain (active)
01 ingest ✓
02 process →
03 organize
04 analyze
Notebook | Version | Last run | Status
ARGOS_ImmortalRegiment_01_DocumentIngest_v1 | v1 | 2026-06-19 | production
ARGOS_ImmortalRegiment_02_PDFExtraction_v1 | v1 | 2026-06-19 | draft
Propaganda Narratives 2026 (paused)
01 ingest ✓
02 process
03 organize
04 analyze
Notebook | Version | Last run | Status
ARGOS_PropagandaNarratives_01_WebIngest_v1 | v1 | 2026-06-10 | paused
Database files
Master database
argos_deep.sqlite
Underscore only. No version in filename. Single source of truth.
Working copy (Colab local)
argos_deep_working.sqlite
Ephemeral — copied from master at session start, deleted when session ends.
Versioned backup
argos_deep_backup_YYYYMMDD_vN.sqlite
e.g. argos_deep_backup_20260621_v3.sqlite — increment N on schema change.
Nightly backup
argos_deep_backup_nightly_YYYYMMDD.sqlite
Auto-created at session end. Rolling 7-day window. Oldest deleted automatically.
✗ argos-deep.sqlite (hyphen = wrong: names are underscore-only, consistent with Python module names, which cannot contain hyphens)
✗ argos_deep_v3.sqlite (version number in master name = wrong)
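The naming rules above can be checked mechanically. The following is a minimal sketch, not part of the ARGOS tooling: the `PATTERNS` table and the `classify_db_filename` helper are hypothetical names, and the regular expressions are transcribed from the four patterns listed above.

```python
import re
from typing import Optional

# Hypothetical patterns transcribed from the naming rules above.
PATTERNS = {
    "master": re.compile(r"^argos_deep\.sqlite$"),
    "working": re.compile(r"^argos_deep_working\.sqlite$"),
    "versioned_backup": re.compile(r"^argos_deep_backup_\d{8}_v\d+\.sqlite$"),
    "nightly_backup": re.compile(r"^argos_deep_backup_nightly_\d{8}\.sqlite$"),
}


def classify_db_filename(filename: str) -> Optional[str]:
    """Return the kind of database file, or None if the name breaks the convention."""
    for kind, pattern in PATTERNS.items():
        if pattern.match(filename):
            return kind
    return None


for name in ("argos_deep.sqlite", "argos-deep.sqlite", "argos_deep_v3.sqlite"):
    print(name, "->", classify_db_filename(name))
```

Both counter-examples from the list above come back as `None`, so a check like this could run before any file is written to `_database/` or `_backups/`.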
Colab notebooks
Pattern
ARGOS_ProjectName_NN_WorkflowName_vX.ipynb
PascalCase project · two-digit stage number · PascalCase task · version number
Stage 01 — ingest
ARGOS_SamoyloveNetwork_01_TelegramIngest_v3.ipynb
Stage 02 — process
ARGOS_SamoyloveNetwork_02_Transcription_v2.ipynb
Stage 03 — organize
ARGOS_SamoyloveNetwork_03_NormalizationToDB_v2.ipynb
Stage 04 — analyze
ARGOS_SamoyloveNetwork_04_NetworkAnalysis_v1.ipynb
Increment version (v1 → v2) only when logic changes significantly. Formatting and comment edits don't count. Keep old versions — never delete them.
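As an illustration of the notebook pattern above, here is a small parser sketch. `NOTEBOOK_RE`, `NotebookName`, and `parse_notebook_name` are hypothetical names, and the regular expression assumes that PascalCase means a leading capital followed by letters or digits.

```python
import re
from typing import NamedTuple, Optional

# Assumed reading of ARGOS_ProjectName_NN_WorkflowName_vX.ipynb
NOTEBOOK_RE = re.compile(
    r"^ARGOS_(?P<project>[A-Z][A-Za-z0-9]*)_(?P<stage>\d{2})_"
    r"(?P<workflow>[A-Z][A-Za-z0-9]*)_v(?P<version>\d+)\.ipynb$"
)


class NotebookName(NamedTuple):
    project: str
    stage: int
    workflow: str
    version: int


def parse_notebook_name(filename: str) -> Optional[NotebookName]:
    """Split a notebook filename into its parts, or return None if it does not match."""
    match = NOTEBOOK_RE.match(filename)
    if match is None:
        return None
    return NotebookName(
        project=match["project"],
        stage=int(match["stage"]),
        workflow=match["workflow"],
        version=int(match["version"]),
    )


print(parse_notebook_name("ARGOS_SamoyloveNetwork_03_NormalizationToDB_v2.ipynb"))
```

The parsed stage and version numbers could feed the `notebooks` tracking table instead of being typed in by hand.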
Scripts & folders
Python scripts
NN_descriptive_name.py
Leading two-digit number sets execution order. e.g. 01_mount_gdrive.py
GDrive system folders
_database _backups _scripts _docs
Underscore prefix = system/infrastructure. Never used as project names.
Project folders
samoylova_network
immortal_regiment_spain
snake_case. Lowercase only. No spaces, no hyphens.
Staging session folders
2026-06-21/session_id_abc123/
ISO date + session UUID prefix. Auto-created by backup_manager.py
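The staging folder layout can be sketched as a path builder. This is an assumption-heavy illustration and does not show how backup_manager.py works: `staging_session_path` is a hypothetical helper, the `session_id_` prefix and the six-character UUID prefix are inferred from the single example above, and the top-level `staging/ProjectName/` part follows the GDrive structure.

```python
import datetime
import uuid
from pathlib import PurePosixPath
from typing import Optional


def staging_session_path(
    project: str,
    day: Optional[datetime.date] = None,
    session_uuid: Optional[uuid.UUID] = None,
    prefix_len: int = 6,  # assumed length, inferred from "session_id_abc123"
) -> PurePosixPath:
    """Build staging/<project>/<ISO date>/session_id_<uuid prefix>."""
    day = day or datetime.date.today()
    session_uuid = session_uuid or uuid.uuid4()
    session_id = f"session_id_{session_uuid.hex[:prefix_len]}"
    return PurePosixPath("staging") / project / day.isoformat() / session_id


print(staging_session_path("SamoyloveNetwork"))
```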
argos_deep.sqlite: production data (schema v3)
Table | Rows | Key columns | Notes
DIGITAL_CONTENT | 130,526 | id · source · content_type · text · actor_id · language | 15% actor_id NULL ⚠
ACTORS | 796 | id · name · aliases · affiliation · confidence | 89% complete
NARRATIVES | 40 | id · theme · description · first_seen | -
TECHNIQUES | 35 | id · disarm_id · name · category | DISARM framework
TRANSCRIPTS | 4,410 | id · content_id · full_text · language · method | 340 missing full_text ⚠
TIMELINE_ITEMS | 658 | id · event_date · actor_id · narrative_id · description | -
CLAIMS | 8,934 | id · content_id · claim_text · ptcof_score · verified | Added in v3
RELATIONSHIPS | 2,104 | id · actor_a · actor_b · rel_type · confidence | Added in v3
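The two warnings in the Notes column (NULL actor_id values and missing full_text) can be recomputed directly from the database. A minimal sketch, assuming the table and column names listed above: `data_quality_report` is a hypothetical helper, and treating an empty string as missing text is an assumption.

```python
import sqlite3


def data_quality_report(conn: sqlite3.Connection) -> dict:
    """Count the gaps flagged in the table notes (NULL actor_id, missing full_text)."""
    null_actor, total = conn.execute(
        "SELECT SUM(actor_id IS NULL), COUNT(*) FROM DIGITAL_CONTENT"
    ).fetchone()
    null_actor = null_actor or 0  # SUM() returns NULL on an empty table
    # Assumption: an empty string counts as missing text, same as NULL.
    missing_text = conn.execute(
        "SELECT COUNT(*) FROM TRANSCRIPTS WHERE full_text IS NULL OR full_text = ''"
    ).fetchone()[0]
    return {
        "digital_content_null_actor": null_actor,
        "digital_content_null_actor_pct": round(100 * null_actor / total, 1) if total else 0.0,
        "transcripts_missing_full_text": missing_text,
    }
```

In a Colab session the connection would point at the working copy, for example `sqlite3.connect('/content/argos_deep_working.sqlite')`.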
argos_tracking.sqlite: progress & metadata (schema v1)
Table | Purpose | Key columns
projects | Project registry | id · name · status · start_date · owner · objectives
notebooks | Notebook tracking | id · project_id · filename · stage · version · last_run · run_count
tasks | Task management | id · project_id · task_name · status · priority · due_date
progress_snapshots | Daily progress log | id · date · project_id · rows_ingested · completed_tasks · notes
weekly_reports | Auto-generated reports | id · week_of · project_id · completed_tasks · blockers · achievements
tracked_urls | Web URL queue | id · url · title · category · project_id · priority · status · notes
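To make the tracked_urls queue concrete, the sketch below creates the table and enqueues a URL. Only the column names come from the table above. The column types, the UNIQUE constraint on url, the default status and priority values, and the `add_tracked_url` helper are all assumptions for illustration.

```python
import sqlite3

# Column names from the tracking schema; types and constraints are assumed.
TRACKED_URLS_DDL = """
CREATE TABLE IF NOT EXISTS tracked_urls (
    id INTEGER PRIMARY KEY,
    url TEXT NOT NULL UNIQUE,
    title TEXT,
    category TEXT,
    project_id INTEGER,
    priority TEXT,
    status TEXT DEFAULT 'pending',
    notes TEXT
)
"""


def add_tracked_url(conn, url, category, project_id=None, priority="normal"):
    """Queue a URL once; return True if it was new, False if already tracked."""
    conn.execute(TRACKED_URLS_DDL)
    cur = conn.execute(
        "INSERT OR IGNORE INTO tracked_urls (url, category, project_id, priority) "
        "VALUES (?, ?, ?, ?)",
        (url, category, project_id, priority),
    )
    conn.commit()
    return cur.rowcount == 1
```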
Schema changelog
Version | Date | Changes
v3 | 2026-06-21 | Full consolidation · added CLAIMS + RELATIONSHIPS · merged argos_deep_working.sqlite · removed 271 duplicates
v2 | 2026-06-20 | Added CLAIMS table · added RELATIONSHIPS table · ACTORS.id migrated string → UUID
v1 | 2026-06-10 | Initial normalized schema from BigQuery argos_v7_1 extraction
GDrive structure: /My Drive/ARGOS/
Path | Contains | Access
_database/ | argos_deep.sqlite · README_DATABASE.txt · argos_tracking.sqlite | read only
_backups/ | Versioned + nightly backups · BACKUP_MANIFEST.txt | immutable
_scripts/00_utilities/ | Shared Python modules (colab_startup, database_helpers, etc.) | writable
_scripts/projects/ProjectName/01_ingest/ | Ingest notebooks + config.yaml | writable
_scripts/projects/ProjectName/02_process/ | Transcription, PDF extraction, cleaning notebooks | writable
_scripts/projects/ProjectName/03_organize/ | Normalization, entity extraction, deduplication | writable
_scripts/projects/ProjectName/04_analyze/ | Network analysis, briefing generation, STIX | writable
_scripts/templates/ | TEMPLATE_01..04.ipynb (copy for new work) | writable
_docs/ | SCHEMA_v3.md · SCHEMA_CHANGELOG.md · DATA_QUALITY_ISSUES.md | writable
staging/ProjectName/YYYY-MM-DD/session_id/ | Session outputs, metadata.json, session_log.txt | writable
_sources/ | BigQuery exports · filesystem scans (reference only) | reference
Colab session paths
Path | Purpose | Persists?
/content/drive/MyDrive/ARGOS/ | GDrive mount point | yes (GDrive)
/content/argos_deep_working.sqlite | Working DB copy; all session work happens here | session only
/content/staging_output/ | Session output files before upload | session only
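The working-copy lifecycle (copied from the master at session start, deleted at session end) maps naturally onto a context manager. The following is a sketch under that reading, and the `working_copy` helper is hypothetical. In Colab the two arguments would presumably be `/content/drive/MyDrive/ARGOS/_database/argos_deep.sqlite` and `/content/argos_deep_working.sqlite`.

```python
import contextlib
import shutil
from pathlib import Path


@contextlib.contextmanager
def working_copy(master: Path, working: Path):
    """Copy the master DB to an ephemeral working file and remove it afterwards."""
    shutil.copy2(master, working)  # session start: copy master to local disk
    try:
        yield working
    finally:
        # Session end: the working copy is ephemeral and always deleted.
        working.unlink(missing_ok=True)
```

This sketch does not write changes back to the master. Any promotion of session work would be a separate step that runs before the block exits.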
Active versions: 3 (v1 · v2 · v3 always kept)
Nightly window: 7 rolling days
Current schema: v3 (2026-06-21)
Total size: 3.1 GB (all backups combined)
Active versioned backups (keep last 3)
argos_deep_backup_20260621_v3.sqlite
Full consolidation · merged 3 databases · 1.2 GB · 2026-06-21
current
argos_deep_backup_20260620_v2.sqlite
Schema v2 — added CLAIMS + RELATIONSHIPS · 1.1 GB · 2026-06-20
keep
argos_deep_backup_20260610_v1.sqlite
Initial normalized schema from BigQuery · 0.9 GB · 2026-06-10
keep
Backup trigger rules
Trigger | Type | Action | Retention
Schema change | Versioned | Create _vN.sqlite before migration | Keep last 3
Bulk ingestion (>10k rows) | Versioned | Create _vN.sqlite before ingest | Keep last 3
Session end (data modified) | Nightly | Auto-create _nightly_YYYYMMDD.sqlite | Rolling 7 days
Any risky change | Versioned | Run PRAGMA integrity_check first, then backup | Keep last 3
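The "integrity check first, then backup" rule could look like the sketch below. `integrity_ok` and `checked_backup` are hypothetical helpers. The copy uses the standard library's `sqlite3.Connection.backup` rather than a plain file copy, which is a substitution made here for illustration and not something the rules above prescribe.

```python
import sqlite3


def integrity_ok(db_path: str) -> bool:
    """Run PRAGMA integrity_check; a healthy database returns the single row 'ok'."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    return rows == [("ok",)]


def checked_backup(src_path: str, dest_path: str) -> None:
    """Refuse to back up a database that fails the integrity check."""
    if not integrity_ok(src_path):
        raise RuntimeError(f"integrity_check failed for {src_path}; backup aborted")
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)  # copies every page of src into dest
    finally:
        dest.close()
        src.close()
```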
Quick backup snippet (Colab)
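import datetime
import shutil

# Naming convention: argos_deep_backup_YYYYMMDD_vN.sqlite (date without hyphens)
stamp = datetime.date.today().strftime("%Y%m%d")
schema_version = 3  # current schema version; increment on schema change
backup_name = f"argos_deep_backup_{stamp}_v{schema_version}.sqlite"

# Copy the session's working DB into the GDrive backups folder
shutil.copy(
    '/content/argos_deep_working.sqlite',
    f'/content/drive/MyDrive/ARGOS/_backups/{backup_name}'
)
print(f"✓ Backup: {backup_name}")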
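The rolling 7-day nightly window can be sketched as a pruning function. `prune_nightly` is a hypothetical helper. It assumes the window means "today plus the six previous days", and it only touches files that match the nightly naming pattern, so versioned backups are left alone.

```python
import datetime
import re
from pathlib import Path

NIGHTLY_RE = re.compile(r"^argos_deep_backup_nightly_(\d{8})\.sqlite$")


def prune_nightly(backup_dir: Path, today: datetime.date, keep_days: int = 7) -> list:
    """Delete nightly backups older than the rolling window; return removed names."""
    cutoff = today - datetime.timedelta(days=keep_days)
    removed = []
    for path in sorted(backup_dir.iterdir()):
        match = NIGHTLY_RE.match(path.name)
        if match is None:
            continue  # not a nightly backup: leave it alone
        try:
            stamp = datetime.datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            continue  # eight digits that are not a real date
        if stamp <= cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```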
Stage 01: Ingest (data collection)

Collect raw data from external sources. Input arrives from the Telegram API, PDFs, web pages, YouTube, or local files. Output goes to staging/ProjectName/YYYY-MM-DD/session_id/.

Source | Notebook pattern | Output format
Telegram API | ARGOS_Project_01_TelegramIngest_vX | CSV / JSON
PDF documents | ARGOS_Project_01_DocumentUpload_vX | Raw text + metadata
Web scraping | ARGOS_Project_01_WebScraping_vX | HTML / markdown
YouTube | ARGOS_Project_01_YoutubeMetadata_vX | JSON + subtitles
Stage 02: Process (transformation)

Transform raw data into a standardized, clean format. This stage reads from the staging session folders and writes processed files back to the same session folder with a _processed suffix.

Operation | Notebook pattern | Utility used
Transcription (audio/video → text) | ARGOS_Project_02_Transcription_vX | transcription_helpers.py
PDF text extraction | ARGOS_Project_02_PDFExtraction_vX | pdf_extraction_helpers.py
Text cleaning / normalization | ARGOS_Project_02_TextCleaning_vX | -
Language detection / NER | ARGOS_Project_02_LanguageProcessing_vX | entity_extraction.py
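The _processed suffix rule can be expressed as a one-line path transformation. This is only a sketch: `processed_path` is a hypothetical helper, and placing the suffix before the file extension is an assumption, since the text above does not say where the suffix goes.

```python
from pathlib import PurePosixPath


def processed_path(raw_path: str) -> PurePosixPath:
    """Return the sibling path in the same folder with a _processed suffix."""
    p = PurePosixPath(raw_path)
    # Assumption: the suffix is inserted before the extension.
    return p.with_name(f"{p.stem}_processed{p.suffix}")


print(processed_path("staging/ProjectName/2026-06-21/session_id_abc123/messages.json"))
```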
Stage 03: Organize (normalization)

Normalize processed data and load it into argos_deep.sqlite. This stage covers deduplication, schema mapping, entity linking, and referential integrity checks.

Operation | Notebook pattern
Load into DB (DIGITAL_CONTENT, etc.) | ARGOS_Project_03_NormalizationToDB_vX
Entity extraction to ACTORS table | ARGOS_Project_03_EntityExtraction_vX
Remove duplicate rows | ARGOS_Project_03_Deduplication_vX
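As a rough illustration of the deduplication step, the sketch below removes repeated DIGITAL_CONTENT rows and keeps the lowest id in each group. The definition of a duplicate (same source and same text) is an assumption and is not taken from the Deduplication notebook, and `remove_duplicates` is a hypothetical helper.

```python
import sqlite3


def remove_duplicates(conn: sqlite3.Connection) -> int:
    """Delete rows that repeat an earlier (source, text) pair; return rows removed."""
    cur = conn.execute(
        """
        DELETE FROM DIGITAL_CONTENT
        WHERE id NOT IN (
            SELECT MIN(id) FROM DIGITAL_CONTENT GROUP BY source, text
        )
        """
    )
    conn.commit()
    return cur.rowcount
```

Per the backup trigger rules, a destructive step like this would normally run against the working copy after a versioned backup has been taken.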
Stage 04: Analyze (intelligence production)

Generate intelligence products from the database. This stage reads from argos_deep.sqlite and outputs HTML briefings, STIX bundles, visualizations, and network graphs.

Operation | Notebook pattern | Output
Network analysis | ARGOS_Project_04_NetworkAnalysis_vX | Graph · PDF
Narrative mapping | ARGOS_Project_04_NarrativeMapping_vX | Report · JSON
HTML briefing | ARGOS_Project_04_BriefingGeneration_vX | HTML
STIX 2.1 bundle | ARGOS_Project_04_STIXGeneration_vX | JSON bundle
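A first step toward network analysis is counting how many relationships each actor takes part in. This standard-library sketch assumes the RELATIONSHIPS columns listed in the schema (actor_a, actor_b, confidence) and a numeric confidence value. `actor_degrees` is a hypothetical helper and is not taken from the NetworkAnalysis notebook.

```python
import sqlite3
from collections import Counter


def actor_degrees(conn: sqlite3.Connection, min_confidence: float = 0.0) -> Counter:
    """Count relationships per actor, ignoring edges below the confidence threshold."""
    degrees = Counter()
    rows = conn.execute(
        "SELECT actor_a, actor_b FROM RELATIONSHIPS WHERE confidence >= ?",
        (min_confidence,),
    )
    for actor_a, actor_b in rows:
        degrees[actor_a] += 1
        degrees[actor_b] += 1
    return degrees
```

`actor_degrees(conn).most_common(10)` would then list the ten most connected actors, which is a reasonable starting point before building a full graph.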
URL queue: 0 URLs
Filters: All · 🔍 Important · 📥 To scrape · ⏳ Pending · 📌 Reference · 🗑️ Skip

Task board
Filters: All · Pending · In progress · Blocked · Completed
Task | Project | Priority | Status
Recent sessions
Date | Project | Stage | Rows | Status
2026-06-21 | Samoylova Network | 03_organize | 1,250 | success
2026-06-20 | Samoylova Network | 02_process | 980 | success
2026-06-19 | Immortal Regiment | 01_ingest | 342 | success