Detecting Internet Brain Rot with Multimodal AI
Visual-Qwen pairs a Q-Former vision encoder with Qwen3-4B and Whisper transcripts to flag “sludge” short-form videos: the stacked, multi-feed clips engineered to bypass single-modality moderators.
A frozen-projector tri-modal classifier
Visual-Qwen reads each clip across three signals at once: an EVA-CLIP frame embedding, a Q-Former cross-modal attention bottleneck, and a Whisper transcript of the audio. Qwen3-4B fuses the streams and emits a sludge / not-sludge verdict.
What is sludge?
Multi-feed short-form clips that stack unrelated content (gameplay over reaction video over text crawl) to defeat algorithmic moderation built for single coherent scenes.
What the paper shows
Three headline contributions, every number traceable to the public test split.
Cross-modal Q-Former
A 32-token attention bottleneck distills heterogeneous vision and audio signals into a single embedding the LLM can fuse.
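The bottleneck idea above can be sketched in a few lines. This is a minimal, dependency-free illustration, not the project's implementation: it uses single-head cross-attention with no learned key/value projections, and it leaves out the self-attention and feed-forward layers a full Q-Former has. The 32-query count comes from the text. The toy width `DIM`, the random stand-in vectors, and the helper names `cross_attend` and `random_vectors` are assumptions made for the example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, features):
    """Single-head cross-attention: each query returns a weighted mix of the features.

    queries:  list of Q vectors (the bottleneck tokens), each of length d
    features: list of N vectors (frame patches, audio frames, ...), each of length d
    returns:  list of Q vectors of length d, whatever N is
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in queries:
        # Scaled dot-product score between this query and every input feature.
        scores = [scale * sum(qi * fi for qi, fi in zip(q, f)) for f in features]
        weights = softmax(scores)
        # Weighted average of the features, one output dimension at a time.
        mixed = [sum(w * f[j] for w, f in zip(weights, features)) for j in range(d)]
        out.append(mixed)
    return out


def random_vectors(count, dim, rng):
    """Hypothetical stand-in for real embeddings: Gaussian noise vectors."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
NUM_QUERIES = 32  # bottleneck width stated in the text
DIM = 16          # toy embedding width (assumption)

queries = random_vectors(NUM_QUERIES, DIM, rng)  # stand-in for learned query tokens
short_clip = random_vectors(50, DIM, rng)        # 50 input features
long_clip = random_vectors(400, DIM, rng)        # 400 input features

short_out = cross_attend(queries, short_clip)
long_out = cross_attend(queries, long_clip)
print(len(short_out), len(long_out))  # prints "32 32"
```

The output always has 32 vectors, so the language model sees a fixed-size summary no matter how many frame or audio features the clip produced.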
Frozen-projector ablation
Freezing the stage-1 Linear projector during LoRA fine-tuning outperformed training it by 0.77 percentage points (pp): less alignment drift, better generalization.
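What "freezing" means mechanically can be shown with a toy example. The sketch below is not the project's training code and does not use LoRA: a single scalar `adapter` weight stands in for the trainable adapter parameters, a scalar `projector` weight stands in for the stage-1 projector, and the model, data, and learning rate are all hypothetical. It only illustrates that a frozen parameter is skipped at update time while gradients still flow through it to the trainable weights. In a PyTorch setup the usual equivalent is setting `requires_grad = False` on the projector's parameters.

```python
def train_step(params, trainable, x, target, lr=0.05):
    """One gradient-descent step on the toy model pred = adapter * (projector * x)."""
    hidden = params["projector"] * x
    pred = params["adapter"] * hidden
    error = pred - target
    # Analytic gradients of the loss 0.5 * error**2 for both parameters.
    grads = {
        "projector": error * params["adapter"] * x,
        "adapter": error * hidden,
    }
    for name, grad in grads.items():
        if trainable[name]:  # frozen parameters are never updated
            params[name] -= lr * grad
    return 0.5 * error ** 2


# Hypothetical starting weights; the projector is assumed to come from an earlier stage.
params = {"projector": 0.5, "adapter": 0.1}
trainable = {"projector": False, "adapter": True}

losses = [train_step(params, trainable, x=2.0, target=3.0) for _ in range(100)]

print(params["projector"])      # prints 0.5, the frozen weight is untouched
print(losses[0] > losses[-1])   # prints True, the adapter alone reduces the loss
```

Flipping `trainable["projector"]` to `True` gives the other arm of such an ablation: both weights move, and the projector can drift away from its stage-1 value.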
Open 2K TikTok-sludge dataset
Two thousand short-form clips, human-validated, paired with Whisper-V3-Turbo transcripts. Released on Kaggle under an open license.
Upload Your Video
Run our fine-tuned multimodal model on your own clip. It looks for sludge: short-form video that stacks unrelated streams together (think Subway Surfers under Family Guy under soap-cutting) to defeat single-modality moderators.
Hosted on Hugging Face Spaces (free CPU). Inference takes about 1 to 2 minutes per video on the default settings, longer with deep analysis.
Meet the Researchers
Four dedicated CS students and their extraordinary advisor from FEU Institute of Technology, combining academic excellence with entrepreneurial vision.

Justine Jude Pura
Project Mentor

Marc Olata
Project Manager

Alpha Romer Coma
Technical Lead

Job Isaac Ong
Data Engineer

Kristoffer Ian Sioson
Data Analyst
Sponsors & Partners
Grateful for the support of organizations sharing our vision of healthier digital environments.

YouTube Researcher Program
Ethical Data Collection

TPU Research Cloud
AI Infrastructure
🤝 Become a Partner
Support cutting-edge AI research and help combat internet brain rot through strategic partnerships.