MARS
A Multimodal Associative Recommendation System
The MARS project developed a multimodal associative recommendation system that simulates human cognitive memory, specifically crossmodal associative recall between vision and language, to provide personalized content recommendations in internet and mobile environments.
Overview
Recommendation underlies many internet and web services. The MARS project developed novel recommendation techniques that simulate human cognitive memory, specifically crossmodal associative recall between vision and language. Machine learning models trained on a corpus of articles containing images were used to convert between the image and text modalities. Combined with user lifelog and social data, this technology was designed to provide personalized crossmodal recommendation services on smartphones and tablets.
The full project title was "Multimodal Information Extraction and Recommendation Technologies for Next-Generation Customized Services Based on Machine Learning", reflecting a five-year phased program that built from low-level multimodal information extraction toward a fully integrated adaptive recommendation system (MARS).
Research Team
Principal Investigator
- Prof. Byoung-Tak Zhang (Seoul National University, Biointelligence Lab)
Co-Investigator
- Prof. Nam Ik Cho (Seoul National University, ISPL Laboratory)
Researchers
- Byoung-Hee Kim
- Jung-Woo Ha
- WoongChang Yoon
- Seong-Jong Ha
- Bado Lee
Contact: Byoung-Hee Kim (bhkim -at- bi.snu.ac.kr)
R&D Objectives (Year by Year)
The project was structured as a five-year program with evolving annual objectives:
| Year | Subtitle | Key Objectives |
|---|---|---|
| 2009 | Research on Information Extraction in Multimodal Richmedia | Attribute definition and relation summarization of complex image/movie/text data; framework for compounding descriptors; mutual generation using image-text cross-modal machine learning |
| 2010 | Research on Context-based Information Extraction in Richmedia | Context-based compound information extraction and descriptor generation; cross-modal context analysis in images and movies; multimodal topic modeling; testing of online article-mall connection system |
| 2011 | Research on Information Extraction of Richmedia in Dynamic Environments | Learning and modeling compound information in time/space-varying data; interactive incremental analysis; incremental social analysis via multimodal topic models applied to microblog analysis |
| 2012 | Research on Recommendation Methods based on Multimodal Associativity | Multimodal-associative modeling of user preferences; interactive recommendation in dynamic richmedia; multimodal interactive article-mall system with user preference and context recognition |
| 2013 | Development of MARS and Its Application to Adaptive Recommendation | Recommendation engine based on multimodal association and user preference modeling; constructing the full MARS framework; personalized/adaptive richmedia recommendation system |
Technical Approach
- Layered hypernetwork architecture for capturing higher-order cross-modal associations between text and image features
- Bidirectional cross-modal inference: text-to-image and image-to-text generation and retrieval
- Multimodal query expansion: enhancing retrieval using cross-modal semantic associations learned from magazine corpora
- Incremental and context-aware modeling of compound information in dynamic, time-varying richmedia
- User lifelog and social data integration for personalized recommendation in mobile contexts (smartphones, tablets)
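The bidirectional recall idea in the list above can be illustrated with a toy sketch. This is not the project's implementation: the function names `build_hyperedges` and `recall`, the example corpus, and the choice to enumerate every word/feature combination (rather than sample hyperedges randomly and evolve their weights, as hypernetwork models typically do) are all assumptions made for illustration. Each hyperedge joins a small set of text words with a small set of image features that co-occur in one article, and recall in either direction sums the weights of the hyperedges matched by the cue.

```python
from collections import Counter
from itertools import combinations


def build_hyperedges(corpus, text_order=2, image_order=1):
    """Build weighted cross-modal hyperedges from (text_words, image_words) pairs.

    Each hyperedge joins `text_order` text words with `image_order` image
    features that co-occur in one article. Its weight counts how many
    articles contain that combination.
    """
    edges = Counter()
    for text_words, image_words in corpus:
        text_sets = combinations(sorted(set(text_words)), text_order)
        image_sets = list(combinations(sorted(set(image_words)), image_order))
        for t in text_sets:
            for v in image_sets:
                edges[(frozenset(t), frozenset(v))] += 1
    return edges


def recall(edges, cue, top_k=3, direction="text_to_image"):
    """Rank items of the other modality by the total weight of matched hyperedges.

    A hyperedge matches when its cue-side part is fully contained in `cue`.
    """
    cue = set(cue)
    scores = Counter()
    for (text_part, image_part), weight in edges.items():
        if direction == "text_to_image":
            source, target = text_part, image_part
        else:  # "image_to_text"
            source, target = image_part, text_part
        if source <= cue:
            for item in target:
                scores[item] += weight
    return [item for item, _ in scores.most_common(top_k)]


# Hypothetical toy corpus: (article keywords, quantized image features).
corpus = [
    (["beach", "sea", "summer"], ["blue", "sand"]),
    (["forest", "hiking", "summer"], ["green", "brown"]),
    (["beach", "sea", "sunset"], ["blue", "orange"]),
]
edges = build_hyperedges(corpus)

print(recall(edges, ["beach", "sea"], top_k=1))  # ['blue']
print(recall(edges, ["blue"], top_k=2, direction="image_to_text"))
```

In this sketch, the pair "beach" and "sea" co-occurs with "blue" in two articles, so "blue" outranks "sand" and "orange" for the text cue. Raising `text_order` or `image_order` captures higher-order associations at the cost of many more hyperedges, which is one reason practical hypernetworks sample hyperedges instead of enumerating them.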
Evaluation
The system was demonstrated on real Korean magazine corpora, showing improved content-based retrieval and recommendation accuracy over single-modality baselines. The article-mall connection system (Storysearch & Storyshop) provided a real-world deployment testbed for the MARS recommendation engine.
Collaboration
- ISPL Laboratory, Seoul National University (Prof. Nam Ik Cho): collaborative research on signal processing and image analysis
- Digital Design House Co. Ltd: industry partner providing the Storysearch and Storyshop platforms as application testbeds
Publications
- J.-W. Ha, B.-H. Kim, B. Lee, and B.-T. Zhang, “Layered hypernetwork models for cross-modal associative text and image keyword generation in multimodal information retrieval,” Proceedings of the Eleventh Pacific Rim International Conference on Artificial Intelligence (PRICAI 2010), 2010.
- M.-O. Heo, M.-G. Kang, and B.-T. Zhang, “Visual query expansion via incremental hypernetwork models of image and text,” Proceedings of the Eleventh Pacific Rim International Conference on Artificial Intelligence (PRICAI 2010), 2010.
- J.-W. Ha, B.-H. Kim, H.-W. Kim, W. C. Yoon, J.-H. Eom, and B.-T. Zhang, “Text-to-image cross-modal retrieval of magazine articles based on higher-order pattern recall by hypernetworks,” The 10th International Symposium on Advanced Intelligent Systems (ISIS 2009), pp. 274-277, 2009. (Best Paper Award)
- H.-W. Kim, B.-H. Kim, and B.-T. Zhang, “Evolutionary hypernetworks for learning to generate music from examples,” IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2009), pp. 47-52, 2009.
- J.-W. Ha, J.H. Jang, D.-H. Kang, W.H. Jung, J.S. Kwon, and B.-T. Zhang, “Gender classification with cortical thickness measurement,” IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2009), pp. 41-46, 2009.
- J.-W. Son and S.-B. Park, “Learning word sense disambiguation in biomedical text with difference between training and test distributions,” ACM Third International Workshop on Data and Text Mining in Bioinformatics, 2009.
- S. Ju and K.-B. Hwang, “A weighting scheme for tag recommendation in social bookmarking systems,” ECML/PKDD Discovery Challenge 2009, 2009.
- J. Bootkrajang, S. Kim, and B.-T. Zhang, “Evolutionary hypernetwork classifiers for protein-protein interaction sentence filtering,” The Genetic and Evolutionary Computation Conference (GECCO 2009), pp. 185-191, 2009.
- J.-W. Ha, B.-H. Kim, B. Lee, and B.-T. Zhang, “Auto-tagging method for unlabeled item images with hypernetworks for article-related item recommender systems,” Journal of the Korean Institute of Information Scientists and Engineers: Computing Practices and Letters, 16(10):1010-1014, 2010.