Videome

Cognitive Machine Learning from Digital Videos

May 2010 – April 2015, funded by the National Research Foundation (NRF)

Overview

Videome (full title: Cognitive Machine Learning and Inference Technologies for Intelligent Recommendation Services) investigated how machines can learn cognitive representations directly from streams of digital video, drawing inspiration from how the human brain constructs episodic memory and conceptual knowledge.

Digital videos are a rich learning substrate for machines because they combine visual and linguistic information. The project used an IPTV-like game platform and EEG to study human visual and linguistic memory in a learning-by-viewing paradigm. The findings informed the development of machine learning techniques designed to simulate human learning and memory. In particular, the project explored nonparametric Bayesian architectures, such as dynamic hypernetworks, that learn from video sequences of unbounded length by self-organizing cognitive networks.

The overall goal was to build an intelligent recommendation system capable of continuously acquiring structured knowledge from video streams, much as a viewer naturally learns from television, and to leverage that knowledge for proactive, personalized content recommendation.

Research Team

Principal Investigator

Prof. Byoung-Tak Zhang (Seoul National University)

Researchers

Jin-San Yang
Jung-Woo Ha
Min-Oh Heo
Ha-Young Jang
Byoung-Hee Kim
Soo-Jin Kim
Ji-Hoon Lee
Je-Keun Rhee
Ho-Sik Seok
Hyunwoo Kim
Jin-Seok Nam
Woongchang Yoon
Bado Lee
Myung-Goo Kang

Contact

Kyung-min Kim (kmkim@bi.snu.ac.kr), +82-2-880-1847

Methodology

The project was structured across five annual phases, each building on the last:

Year 1 (2010) — Interactive Home Media Platform and Recommendation System

Designed a multimodal multigraph (MMG) framework for an interactive home media platform
Built a Videome engine capable of managing video archives exceeding 1,000 hours
Developed foundational algorithms for content recommendation using TV viewing logs
Applied probabilistic graphical models with EM-based learning and cross-entropy-based training
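
The EM-based learning mentioned above can be illustrated on a toy problem. The sketch below assumes, purely for illustration, that viewers fall into two latent groups with different probabilities of watching an offered program. The function name, the data, and the two-group binomial mixture are hypothetical and are not taken from the project.

```python
import math


def em_two_groups(watched, n_shown, iters=50, p_init=(0.3, 0.7)):
    """EM for a two-component binomial mixture (illustrative only).

    watched[i] is how many of n_shown offered programs viewer i watched.
    Returns (weight of group 0, [p_group0, p_group1], log-likelihood trace).
    The binomial coefficient is omitted: it is the same for both groups,
    so it cancels in the responsibilities and only shifts the
    log-likelihood by a constant.
    """
    pi0 = 0.5
    p = list(p_init)
    trace = []
    for _ in range(iters):
        # E-step: posterior probability that each viewer belongs to group 0.
        resp = []
        loglik = 0.0
        for k in watched:
            l0 = pi0 * p[0] ** k * (1 - p[0]) ** (n_shown - k)
            l1 = (1 - pi0) * p[1] ** k * (1 - p[1]) ** (n_shown - k)
            resp.append(l0 / (l0 + l1))
            loglik += math.log(l0 + l1)
        trace.append(loglik)
        # M-step: re-estimate the mixing weight and watch probabilities.
        r0 = sum(resp)
        r1 = len(watched) - r0
        pi0 = r0 / len(watched)
        p[0] = sum(r * k for r, k in zip(resp, watched)) / (r0 * n_shown)
        p[1] = sum((1 - r) * k for r, k in zip(resp, watched)) / (r1 * n_shown)
    return pi0, p, trace


# Hypothetical viewing log: programs watched out of 10 offered, per viewer.
log = [1, 2, 1, 0, 2, 8, 9, 7, 10, 8]
weight, probs, trace = em_two_groups(log, n_shown=10)
print(round(weight, 2), [round(x, 2) for x in probs])
```

Each iteration alternates between assigning viewers softly to groups and re-estimating the group parameters, and the log-likelihood trace should never decrease.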

Year 2 (2011) — Content Learning and Channel Recommendation via Probabilistic Graphical Models

Developed MMG-based learning models using Bayesian inference
Designed automatic channel detection and recognition algorithms for multimodal content
Built channel recommendation systems using graphical model learning
Applied particle filtering and MCMC-based probabilistic analysis for viewer behavior modeling
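
As a rough illustration of particle filtering for viewer behavior modeling, the sketch below tracks a single latent "interest level" from noisy observations with a bootstrap particle filter. The state model (a Gaussian random walk), the observation model, and all names and parameter values are assumptions made for this example, not the project's actual models.

```python
import math
import random


def particle_filter(observations, n_particles=500, process_sd=0.3,
                    obs_sd=0.5, seed=0):
    """Bootstrap particle filter for a 1-D random-walk state (illustrative).

    Returns the filtered posterior-mean estimate after each observation.
    """
    rng = random.Random(seed)
    # Initial particles drawn from an assumed N(0, 1) prior.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Predict: propagate each particle through the random-walk dynamics.
        particles = [x + rng.gauss(0.0, process_sd) for x in particles]
        # Weight: Gaussian likelihood of the observation given each particle.
        weights = [math.exp(-0.5 * ((y - x) / obs_sd) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # Fall back to uniform weights if every likelihood underflows.
            weights = [1.0] * n_particles
            total = float(n_particles)
        estimates.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Hypothetical observations of a viewer's interest signal.
print(particle_filter([2.0] * 20)[-1])
```

With a constant observation stream the estimate should settle close to the observed value, since the prior's influence decays at every step.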

Year 3 (2012) — Activity-Aware Interactive Recommendation via Probabilistic Graphical Models

Extended probabilistic Markov model-based activity inference
Developed multichannel switching and content recommendation algorithms
Applied sequential Monte Carlo methods for viewer activity modeling
Used importance sampling for viewer command synthesis and generation
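
Importance sampling, named in the list above, estimates an expectation under one distribution using samples drawn from another. The sketch below is a minimal self-normalized estimator over a small set of viewer commands. The command names, probabilities, and utility values are hypothetical and serve only to make the example concrete.

```python
import random


def importance_estimate(target, proposal, utility, n_samples=20000, seed=0):
    """Self-normalized importance sampling over discrete commands.

    Draws commands from `proposal`, reweights each draw by
    target/proposal, and returns the weighted average utility,
    which approximates the expected utility under `target`.
    """
    rng = random.Random(seed)
    commands = list(proposal)
    probs = [proposal[c] for c in commands]
    draws = rng.choices(commands, weights=probs, k=n_samples)
    weights = [target[c] / proposal[c] for c in draws]
    weighted_sum = sum(w * utility[c] for w, c in zip(weights, draws))
    return weighted_sum / sum(weights)


# Hypothetical command distributions and utilities.
target = {"play": 0.5, "pause": 0.1, "next": 0.3, "prev": 0.1}
proposal = {"play": 0.25, "pause": 0.25, "next": 0.25, "prev": 0.25}
utility = {"play": 1.0, "pause": 0.0, "next": 2.0, "prev": -1.0}

# Exact expectation under the target, for comparison with the estimate.
exact = sum(target[c] * utility[c] for c in target)
print(exact, importance_estimate(target, proposal, utility))
```

The uniform proposal is chosen here only because it is easy to sample from. In practice the estimate's variance depends on how closely the proposal matches the target.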

Year 4 (2013) — Lifelong Learning and Activity Inference for Interactive Recommendation

Built extended probabilistic graphical models for task-aware, activity-aware recommendation
Developed continuous, lifelong recommendation learning frameworks via sequential Bayesian inference
Applied efficient analysis methods to large-scale interactive program guides
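
Sequential Bayesian inference updates a belief one observation at a time, so yesterday's posterior becomes today's prior. A minimal sketch, assuming a Beta-Bernoulli model of whether a viewer likes a given genre, is shown below. The class name, the like/dislike feedback signal, and the optional `decay` factor (a simple forgetting heuristic) are assumptions for illustration and do not describe the project's framework.

```python
class PreferenceBelief:
    """Beta belief over the probability that a viewer likes a genre."""

    def __init__(self, alpha=1.0, beta=1.0, decay=1.0):
        # alpha and beta are pseudo-counts of likes and dislikes.
        # decay < 1.0 down-weights old evidence (assumed heuristic);
        # decay == 1.0 gives the exact conjugate Bayesian update.
        self.alpha = alpha
        self.beta = beta
        self.decay = decay

    def update(self, liked):
        """Fold one like/dislike observation into the belief."""
        self.alpha = self.decay * self.alpha + (1.0 if liked else 0.0)
        self.beta = self.decay * self.beta + (0.0 if liked else 1.0)

    def mean(self):
        """Posterior mean probability of a like."""
        return self.alpha / (self.alpha + self.beta)


# Hypothetical feedback stream: like, like, dislike, like.
belief = PreferenceBelief()
for liked in [True, True, False, True]:
    belief.update(liked)
print(belief.mean())  # Beta(4, 2) has mean 4/6
```

Because each update needs only the current pseudo-counts, the belief can keep adapting indefinitely without storing the viewing history.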

Year 5 (2014) — Proactive Recommendation Services and Optimization

Developed proactive service recommendation combining diverse context signals
Personalized and optimized recommendation models for specific service scenarios
Addressed cold-start problems through novel user modeling approaches
Extended models to incorporate new activity domains and wearable sensor data

Key Technical Contributions

Multimodal Multigraph (MMG): a unified probabilistic graphical framework representing multimodal video content and viewer behavior
Dynamic Hypernetworks: nonparametric Bayesian architectures for lifelong, unbounded video stream learning
Deep Concept Hierarchy (DCH): a model enabling progressive, incremental abstraction of visual-linguistic knowledge from video sequences
Sequential Bayesian inference: continuous, online adaptation to evolving viewer preferences
Demonstration corpus: children’s cartoon series (e.g., Pororo), used as a large-scale video corpus for vision-language learning
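
To give a feel for the hyperedge idea behind the hypernetwork models listed above, the toy sketch below stores small co-occurring sets of visual and linguistic tokens as weighted hyperedges and uses them to recall associated tokens from a partial cue. It is a deliberately simplified illustration under stated assumptions: it enumerates every fixed-size subset instead of sampling, it has no Bayesian or dynamic component, and the token names and scoring rule are hypothetical. It is not the project's implementation.

```python
from collections import Counter
from itertools import combinations


def build_hyperedges(examples, order=2):
    """Count every `order`-sized set of tokens that co-occurs in an example.

    Each counted set acts as a weighted hyperedge (illustrative only).
    """
    weights = Counter()
    for items in examples:
        for edge in combinations(sorted(set(items)), order):
            weights[edge] += 1
    return weights


def recall(weights, cue, top_k=3):
    """Rank tokens outside the cue by the weight of hyperedges touching it."""
    cue = set(cue)
    scores = Counter()
    for edge, w in weights.items():
        overlap = cue.intersection(edge)
        if overlap:
            for item in edge:
                if item not in cue:
                    # Assumed scoring rule: weight times cue overlap size.
                    scores[item] += w * len(overlap)
    return [item for item, _ in scores.most_common(top_k)]


# Hypothetical scene annotations mixing image tags and subtitle words.
scenes = [
    ["img:penguin", "img:snow", "word:pororo", "word:sled"],
    ["img:penguin", "img:house", "word:pororo", "word:friend"],
    ["img:fox", "img:snow", "word:eddy", "word:sled"],
]
memory = build_hyperedges(scenes, order=2)
print(recall(memory, ["img:penguin"], top_k=1))
```

The image tag recalls the word it co-occurs with most often, which is the kind of cross-modal association a vision-language memory is meant to capture.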

Project Information

Field	Details
Full Title	Cognitive Machine Learning and Inference Technologies for Intelligent Recommendation Services
Duration	May 2010 – April 2015
Funding	National Research Foundation (NRF)
Principal Investigator	Prof. Byoung-Tak Zhang