Videome
Cognitive Machine Learning from Digital Videos
Overview
Videome (full title: Cognitive Machine Learning and Inference Technologies for Intelligent Recommendation Services) investigated how machines can learn cognitive representations directly from streams of digital video, drawing inspiration from how the human brain constructs episodic memory and conceptual knowledge.
Digital videos are a rich learning substrate for machines because they pair visual and linguistic information over time. The project used an IPTV-like game platform and EEG to study human visual and linguistic memory in a learning-by-viewing paradigm. The findings informed the development of machine learning techniques designed to simulate human learning and memory. In particular, the project explored nonparametric Bayesian architectures, such as dynamic hypernetworks, that self-organize cognitive networks while learning from video sequences of unbounded length.
The overall goal was to build an intelligent recommendation system capable of continuously acquiring structured knowledge from video streams, much as a viewer naturally learns from television, and to leverage that knowledge for proactive, personalized content recommendation.
Research Team
Principal Investigator
- Prof. Byoung-Tak Zhang (Seoul National University)
Researchers
- Jin-San Yang
- Jung-Woo Ha
- Min-Oh Heo
- Ha-Young Jang
- Byoung-Hee Kim
- Soo-Jin Kim
- Ji-Hoon Lee
- Je-Keun Rhee
- Ho-Sik Seok
- Hyunwoo Kim
- Jin-Seok Nam
- Woongchang Yoon
- Bado Lee
- Myung-Goo Kang
Contact
- Kyung-min Kim (kmkim@bi.snu.ac.kr), +82-2-880-1847
Methodology
The project was structured across five annual phases, each building on the last:
Year 1 (2010) — Interactive Home Media Platform and Recommendation System
- Designed a multimodal multigraph (MMG) framework for an interactive home media platform
- Built a Videome engine capable of managing video archives exceeding 1,000 hours
- Developed foundational algorithms for content recommendation using TV viewing logs
- Applied probabilistic graphical models with EM-based learning and cross-entropy-based training
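As a rough illustration of EM-based learning on viewing logs, the sketch below fits a mixture of categorical distributions to per-viewer genre counts. It is not the project's model: the genre columns, the toy counts, the function name `em_mixture`, the data-driven initialization, and the add-one smoothing are all assumptions made for this example.

```python
import math

def em_mixture(counts, n_types=2, n_iters=50):
    """Fit a mixture of categorical distributions to per-viewer genre counts."""
    n_viewers, n_genres = len(counts), len(counts[0])
    mix = [1.0 / n_types] * n_types
    # Assumed initialization: seed each latent type from one evenly spaced viewer.
    theta = []
    for k in range(n_types):
        idx = (k * (n_viewers - 1)) // max(n_types - 1, 1)
        raw = [c + 1.0 for c in counts[idx]]
        total = sum(raw)
        theta.append([v / total for v in raw])

    resp = []
    for _ in range(n_iters):
        # E-step: responsibility of each type for each viewer, computed in log space.
        resp = []
        for row in counts:
            log_p = []
            for k in range(n_types):
                ll = sum(c * math.log(theta[k][g]) for g, c in enumerate(row))
                log_p.append(math.log(mix[k]) + ll)
            top = max(log_p)
            unnorm = [math.exp(lp - top) for lp in log_p]
            z = sum(unnorm)
            resp.append([u / z for u in unnorm])
        # M-step: re-estimate mixing weights and genre distributions,
        # with add-one smoothing so no probability collapses to zero.
        for k in range(n_types):
            weight = sum(r[k] for r in resp)
            mix[k] = (weight + 1.0) / (n_viewers + n_types)
            raw = [1.0 + sum(r[k] * row[g] for r, row in zip(resp, counts))
                   for g in range(n_genres)]
            total = sum(raw)
            theta[k] = [v / total for v in raw]
    return mix, theta, resp

# Hypothetical viewing log: rows are viewers, columns are (news, sports, cartoon) counts.
counts = [[9, 1, 0], [8, 2, 0], [10, 0, 1], [0, 1, 9], [1, 0, 8], [0, 2, 10]]
mix, theta, resp = em_mixture(counts)
```

Each row of `resp` gives the inferred probability that a viewer belongs to each latent type, and the rows of `theta` could serve as per-type genre preferences when ranking content.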
Year 2 (2011) — Content Learning and Channel Recommendation via Probabilistic Graphical Models
- Developed MMG-based learning models using Bayesian inference
- Designed automatic channel detection and recognition algorithms for multimodal content
- Built channel recommendation systems using graphical model learning
- Applied particle filtering and MCMC-based probabilistic analysis for viewer behavior modeling
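To make the particle filtering idea concrete, here is a minimal bootstrap particle filter that tracks a hidden viewer state from channel-switch observations. The two states, the transition and switch probabilities, and the observation sequence are hypothetical values chosen for illustration; the project's actual viewer model is not described here.

```python
import random

# Assumed two-state viewer model (illustrative values only).
STATES = ("engaged", "browsing")
STAY_PROB = 0.9                                   # probability of keeping the same state
SWITCH_PROB = {"engaged": 0.1, "browsing": 0.7}   # P(channel switch | state)

def propagate(state, rng):
    """Sample the next hidden state from the transition model."""
    if rng.random() < STAY_PROB:
        return state
    return "browsing" if state == "engaged" else "engaged"

def likelihood(state, switched):
    """Probability of the observation (switched or not) given the hidden state."""
    p = SWITCH_PROB[state]
    return p if switched else 1.0 - p

def particle_filter(observations, n_particles=2000, seed=0):
    """Return the filtered estimate of P(browsing) after each observation."""
    rng = random.Random(seed)
    particles = [rng.choice(STATES) for _ in range(n_particles)]
    estimates = []
    for switched in observations:
        # 1. Propagate every particle through the transition model.
        particles = [propagate(s, rng) for s in particles]
        # 2. Weight particles by how well they explain the observation.
        weights = [likelihood(s, switched) for s in particles]
        # 3. Resample in proportion to the weights (multinomial resampling).
        particles = rng.choices(particles, weights=weights, k=n_particles)
        estimates.append(sum(s == "browsing" for s in particles) / n_particles)
    return estimates

# True means the viewer switched channel during that time step.
observations = [False, False, True, True, True]
browsing_prob = particle_filter(observations)
```

With this observation sequence the estimated probability of the "browsing" state starts low and rises once the switches begin, which is the kind of signal a recommender could use to decide when to suggest a channel.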
Year 3 (2012) — Activity-Aware Interactive Recommendation via Probabilistic Graphical Models
- Extended probabilistic Markov model-based activity inference
- Developed multichannel switching and content recommendation algorithms
- Applied Sequential Monte Carlo methods for viewer activity modeling
- Used importance sampling for viewer command synthesis and generation
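Importance sampling, mentioned in the list above, estimates an expectation under a target distribution using samples drawn from a simpler proposal. The sketch below shows the self-normalized form over a small set of viewer commands. The command names, the target probabilities, and the uniform proposal are assumptions for illustration and do not come from the project.

```python
import random

# Hypothetical target distribution over viewer commands (assumed values).
TARGET = {"channel_up": 0.5, "channel_down": 0.2, "volume_up": 0.2, "mute": 0.1}
COMMANDS = list(TARGET)
PROPOSAL = 1.0 / len(COMMANDS)  # uniform proposal probability for every command

def importance_estimate(f, n_samples=20000, seed=0):
    """Self-normalized importance sampling estimate of E_target[f(command)]."""
    rng = random.Random(seed)
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(n_samples):
        cmd = rng.choice(COMMANDS)       # draw from the proposal
        w = TARGET[cmd] / PROPOSAL       # importance weight: target / proposal
        weighted_sum += w * f(cmd)
        weight_total += w
    return weighted_sum / weight_total

def is_channel_change(cmd):
    """Indicator function: 1.0 for channel commands, 0.0 otherwise."""
    return 1.0 if cmd.startswith("channel") else 0.0

estimate = importance_estimate(is_channel_change)
exact = TARGET["channel_up"] + TARGET["channel_down"]
```

Because the target here is small and discrete, `exact` can be computed directly and compared with `estimate`; in a realistic setting the target would only be available up to the weights.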
Year 4 (2013) — Lifelong Learning and Activity Inference for Interactive Recommendation
- Built extended probabilistic graphical models for task-aware, activity-aware recommendation
- Developed continuous lifelong recommendation learning frameworks via sequential Bayesian inference
- Applied efficient analysis methods to large-scale interactive program guides
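Sequential Bayesian inference updates a posterior one observation at a time, with each posterior serving as the prior for the next event. A minimal sketch, assuming a Dirichlet prior over a viewer's genre preferences and a stream of watched genres (the genre names, the uniform prior, and the helper functions are hypothetical):

```python
def update_posterior(alpha, watched_genre):
    """Return new Dirichlet parameters after observing one viewing event."""
    new_alpha = dict(alpha)
    new_alpha[watched_genre] = new_alpha.get(watched_genre, 0.0) + 1.0
    return new_alpha

def posterior_mean(alpha):
    """Expected preference for each genre under the current Dirichlet posterior."""
    total = sum(alpha.values())
    return {genre: a / total for genre, a in alpha.items()}

# Assumed uniform prior: one pseudo-count per genre.
alpha = {"news": 1.0, "sports": 1.0, "cartoon": 1.0}

# Each event updates the posterior, which becomes the prior for the next event.
for event in ["cartoon", "cartoon", "news", "cartoon"]:
    alpha = update_posterior(alpha, event)

preferences = posterior_mean(alpha)
```

Since the Dirichlet is conjugate to the categorical likelihood, each update is a single count increment, which is what makes this style of inference practical for continuous, online adaptation.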
Year 5 (2014) — Proactive Recommendation Services and Optimization
- Developed proactive service recommendation combining diverse context signals
- Personalized and optimized recommendation models for specific service scenarios
- Addressed cold-start problems through novel user modeling approaches
- Extended models to incorporate new activity domains and wearable sensor data
Key Technical Contributions
- Multimodal Multigraph (MMG): a unified probabilistic graphical framework representing multimodal video content and viewer behavior
- Dynamic Hypernetworks: nonparametric Bayesian architectures for lifelong, unbounded video stream learning
- Deep Concept Hierarchy (DCH): a model enabling progressive, incremental abstraction of visual-linguistic knowledge from video sequences
- Sequential Bayesian Inference for continuous, online adaptation to evolving viewer preferences
- Demonstration on children’s cartoon series (e.g., Pororo), used as a large-scale video corpus for vision-language learning
Project Information
| Field | Details |
|---|---|
| Full Title | Cognitive Machine Learning and Inference Technologies for Intelligent Recommendation Services |
| Duration | May 2010 – April 2015 |
| Funding | National Research Foundation (NRF) |
| Principal Investigator | Prof. Byoung-Tak Zhang |