




Summary
Our objective is to advance machines' ability to process and learn from enormous amounts of multimodality sensing data, at a level similar to or beyond what human beings can do. To organize, realize, and extract intelligence from the huge volume of continuous multimodality streams that a person, a company, or a community receives, we consider real-world machine learning and distributed signal processing to be the keys. Real-world machine learning focuses on multimodality machine learning from imperfect and continuous sensing data. Distributed signal processing allows machines to classify large-scale data based on the learned models. Future machine cognition techniques have to be scalable, autonomous, and general enough to handle diverse types of data, such as video, text, speech, sensor signals, and human activity logs.
First, recent progress on autonomous learning will be presented; its goal is to avoid human involvement in learning concept models. Going a step beyond semi-supervised learning, our approach explores the weak correlations between multimodality representations of an instance, an event, or an activity. In our experiments on visual concept learning, this approach produced results comparable to those of supervised learning methods.
Another of our goals is to build a distributed system that semantically monitors large-scale multimodality streams. A key design requirement is the ability to handle tens of gigabytes of multimedia data per second. I will introduce our distributed framework, which includes a hardware implementation of a smart video camera, algorithmic improvements that speed up SVM classification by a factor on the order of tens to thousands, and software improvements to feature extraction.
Biography
Dr. Ching-Yung Lin received his Ph.D. degree in Electrical Engineering from Columbia University. Since October 2000, he has been a Research Staff Member at the IBM T. J. Watson Research Center, New York, where he currently leads projects on the IBM Large-Scale Video Semantic Filtering System. He is also an Adjunct Associate Professor at Columbia University and an Affiliate Associate Professor at the University of Washington, Seattle.
His research focuses mainly on multimodality signal processing and understanding, with applications in distributed computing, embedded vision systems, social computing, and security. In 2003, Dr. Lin led the first large-scale video semantic annotation project, which included 23 research institutes worldwide. His multimedia semantic mining project team has performed best in the US National Institute of Standards and Technology (NIST) semantic video concept detection benchmarking since 2002. Dr. Lin is the Editor of the Interactive Magazines (EIM) of the IEEE Communications Society (2004-2006), an Associate Editor of the IEEE Trans. on Multimedia (2004-), and an Editorial Board Member of the Journal of Visual Communication and Image Representation (2005-). He served as a Guest Editor of the Proceedings of IEEE -- Special Issue on Digital Rights Management, June 2004, a Guest Editor of the EURASIP Journal on Applied Digital Signal Processing -- Special Issue on Visual Sensor Network, 2006, and the Technical Program co-chair of IEEE ITRE 2003. Dr. Lin is a recipient of the 2003 IEEE Circuits and Systems Society Outstanding Young Author Award and of IBM Invention Achievement Awards in 2001 and 2003. He is the (co-)author of more than 100 journal articles, conference papers, a book, book chapters, and publicly released software. Dr. Lin is a Senior Member of IEEE and a member of ACM, INSNA, and AAAS.



