Thursday, April 2
706 chunks
25 topics
Topics discovered via unsupervised clustering of transcript embeddings.
Airtime Distribution
Each segment = one topic cluster
Captures live streams in 12-hour blocks. Transcribes them with Whisper large-v2 and splits the transcripts at sentence boundaries into chunks of about 400 tokens. BERTopic (sentence-transformers + UMAP + HDBSCAN) clusters the transcript chunks to discover topics. Qwen 7B summarizes each cluster and filters out infrastructure content. Refreshes daily.
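The sentence-boundary chunking step could look roughly like the sketch below. It is an illustration under stated assumptions, not the pipeline's actual code: the function names `split_sentences` and `chunk_transcript` are hypothetical, the sentence splitter is a naive regex, and whitespace-separated words stand in for tokens because the tokenizer used for the ~400-token budget is not stated.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def chunk_transcript(text, max_tokens=400):
    """Group whole sentences into chunks of at most max_tokens words.

    Words are a stand-in for tokens (assumption). A single sentence longer
    than max_tokens is kept whole as its own chunk rather than being cut.
    """
    chunks, current, count = [], [], 0
    for sentence in split_sentences(text):
        n = len(sentence.split())
        # Close the current chunk if adding this sentence would exceed the budget.
        if current and count + n > max_tokens:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    sample = "One two three. Four five six! Seven eight nine? Ten."
    for chunk in chunk_transcript(sample, max_tokens=6):
        print(chunk)
```

Because chunks always end on a sentence boundary, their sizes vary below the budget, which is consistent with the approximate "~400 tokens" figure.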