Saturday, April 4
627 chunks
22 topics
Topics discovered via unsupervised clustering of transcript embeddings.
Airtime Distribution
Each segment = one topic cluster
Captures live streams in 12-hour blocks. Transcribes them with Whisper large-v2, then splits the transcripts at sentence boundaries into chunks of roughly 400 tokens. BERTopic (sentence-transformers embeddings, UMAP dimensionality reduction, HDBSCAN clustering) groups the transcript chunks into topics. Qwen 7B summarizes each cluster and filters out infrastructure content. Refreshes daily.
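The sentence-boundary chunking step can be illustrated with a minimal sketch. This is not the pipeline's actual code: the function name `chunk_transcript`, the regex-based sentence splitter, and the use of whitespace-separated words as a stand-in for tokens are all assumptions made for illustration. A real pipeline would likely count tokens with the embedding model's tokenizer.

```python
import re


def chunk_transcript(text: str, max_tokens: int = 400) -> list[str]:
    """Group whole sentences into chunks of at most roughly max_tokens tokens.

    Hypothetical helper: sentence splitting and token counting are
    simplified stand-ins, not the pipeline's real implementation.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for sentence in sentences:
        n = len(sentence.split())  # whitespace words approximate tokens
        # Close the current chunk if this sentence would exceed the budget.
        if current and count + n > max_tokens:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    demo = "One two three. Four five six. Seven eight."
    for chunk in chunk_transcript(demo, max_tokens=6):
        print(chunk)
```

Because chunks only break between sentences, a single sentence longer than the budget is kept whole rather than cut mid-sentence, so chunk sizes are approximate. That is consistent with the "~400 tokens" figure above.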