Apr 26
2024
04/30 (Tue.)_Be a Thinker: Prepare for the New Era of Generative AI
webman
Topic: Be a Thinker: Prepare for the New Era of Generative AI
Time: April 30, 2024 (Tue.), 1:00 PM to 2:10 PM
Place: Delta Building, Room 104
Abstract:
We are at the beginning of the generative AI era, in which much of our society will undergo fundamental changes. There will be many new opportunities for everyone, but also many uncertainties that can be difficult to understand. Being able to think through issues deeply will be more important than ever. I will encourage practicing thinking and suggest some techniques.
Speaker:
H. T. Kung is William H. Gates Professor of Computer Science and Electrical Engineering at Harvard University. He conducts research on topics related to the application of artificial intelligence in manufacturing and healthcare, AI accelerators, VLSI design, high-performance computing, parallel and distributed computing, computer architectures, and computer networks.
Jan 03
2024
01/08 (Mon.)_Learning Visual Perception that Foundation Models Haven’t Learned_Speaker: Dr. Tsung-Wei Ke (CMU)
webman
Topic: Learning Visual Perception that Foundation Models Haven’t Learned
Time: January 8, 2024 (Mon.), 1:30 PM to 2:30 PM
Place: Delta Building, Conference Room 613
Abstract:
Foundation models have achieved significant success in computer vision research. Vision-Language Models (VLMs) appear promising for addressing recognition challenges: they excel at picking out fine-grained semantics, including those unseen during training. The Segment Anything Model (SAM) is a strong and general image parser that generates segmentation masks for any given prompt. However, these models have not fully solved visual perception. Recognition models, including VLMs, are not robust to distribution shifts caused by factors such as occlusion, sensor noise, and domain gaps. Segmentation models, including SAM, fail to discover and localize parts within the whole image, and pixel groupings are often inconsistent across segmentation granularities. In this talk, I will present our recent work addressing both challenges. First, we combined state-of-the-art image generative models and recognition models; by optimizing the recognition models with generative objectives, we improved the recognition of out-of-distribution images. Second, we introduced unsupervised hierarchical image segmentation frameworks that generate consistent pixel groupings across the segmentation hierarchy.
Bio:
Tsung-Wei is a postdoctoral researcher at CMU, working with Katerina Fragkiadaki. He obtained his Ph.D. degree from UC Berkeley, working with Stella Yu. He is interested in computer vision and embodied AI.
Dec 18
2023
12/25 (Mon.)_Recent results on learning with diffusion models_Speaker: Ming-Hsuan Yang
webman
Speaker: Ming-Hsuan Yang
Time: December 25, 2023 (Mon.), 10:00-11:30
Place: Delta R106
Abstract:
Diffusion models have been successfully applied to text-to-image generation with state-of-the-art performance. In this talk, I will discuss how these models can be used for low-level vision tasks and 3D scenes. First, I will present our findings on exploiting features from diffusion models and transformers for zero-shot semantic correspondence and other applications. Next, I will describe how we exploit diffusion models as effective priors for dense prediction, such as surface normal, depth, and segmentation. I will then discuss how diffusion models can facilitate articulated 3D reconstruction, 3D scene generation, and novel view synthesis. When time allows, I will present other results on fine-grained text-to-image generation and pixel-wise visual grounding of large multimodal models.
Bio:
Ming-Hsuan Yang is a Professor at UC Merced and a Research Scientist with Google. He received the Google Faculty Award in 2009 and the CAREER Award from the National Science Foundation in 2012. Yang received paper awards at UIST 2017, CVPR 2018, and ACCV 2018, and the Longuet-Higgins Prize at CVPR 2023.
He is an Associate Editor-in-Chief of PAMI, Editor-in-Chief of CVIU, and Associate Editor of IJCV. He was the Program Chair for ACCV 2014 and ICCV 2019 and a Senior Area Chair/Area Chair for CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, IJCAI, and AAAI. Yang is a Fellow of the IEEE and ACM.
Dec 13
2023
12/15 (Fri.)_What the History of AI Says about its Future: Designing with Non-Use in Mind_Speaker: Dr. Jonnie Penn (Harvard University)
webman
Topic: What the History of AI Says about its Future: Designing with Non-Use in Mind
Time: December 15, 2023 (Fri.), 10:00 AM to 12:00 PM
Place: Delta Building, Room 108
Abstract:
Since the term ‘AI’ was introduced in the 1950s, it has been used to describe three entirely different schools of thought about the nature of machine intelligence. Why is this? This talk introduces the forgotten forces behind the origins of ‘AI’. These complex histories provide rich evidence with which to calibrate speculation about AI and AI Ethics in the decades ahead. I introduce one overlooked trend around non-use.
Questions for the students to think about in advance of the class:
How would knowing the intention of your user help you to design an AI system?
How could restraint (when a user chooses not to use a technology) help you design better AI systems?
Bio:
Dr. Jonnie Penn, FRSA, is an Associate Teaching Professor of AI Ethics and Society at the University of Cambridge. He is a historian of technology, a #1 New York Times bestselling author, and a public speaker.
Penn serves as a Faculty Affiliate at the Berkman Klein Center at Harvard University, a Research Fellow and Teaching Associate at the Department of History and Philosophy of Science, a Research Fellow at St. Edmund’s College, and an Associate Fellow at the Leverhulme Centre for the Future of Intelligence.
He was formerly an MIT Media Lab Assembly Fellow, a Google Technology Policy Fellow, a Fellow of the British National Academy of Writing, and a popular broadcaster.
Dec 08
2023
12/14 (Thu.)_Applications and Challenges of Computer Vision: The iPhone Cinematic Mode as an Example_Speaker: 鄭元博, Deputy Manager (Novatek Microelectronics)
webman
Topic: Applications and Challenges of Computer Vision: The iPhone Cinematic Mode as an Example
Time: December 14, 2023 (Thu.), 3:30 PM to 5:00 PM
Place: Delta Building, Room 108
Abstract:
• Using real-world examples, introduce how computer vision algorithms are applied in portable products.
• Help students understand how to connect what they learn in courses with industry needs.
Education/Experience:
Education:
• Ph.D. in Computer Science, National Tsing Hua University
Experience:
• 2013-2018: Engineer, Information and Communications Research Laboratories, ITRI.
• 2018-present: Deputy Manager, Novatek Microelectronics, responsible for developing video processing algorithms.
Currently leads a 20-person algorithm team and is responsible for planning the business unit's new technology and product development directions.
Nov 29
2023
12/07 (Thu.)_Computer Vision: A Journey of Pursuing 3D World Understanding_Prof. Xiaoming Liu (Michigan State University)
webman
Topic: Computer Vision: A Journey of Pursuing 3D World Understanding
Time: December 7, 2023 (Thu.), 3:30 PM to 5:30 PM
Place: Delta Building, Room 108
Abstract: We are living in a 3D world. When a camera takes a picture or video, much of the 3D information is inevitably lost due to the camera projection. As one of the most active fields in AI, computer vision aims to develop algorithms that can derive meaningful information from visual content. One fundamental quest of computer vision is to recover this 3D information and thus enable a faithful 3D understanding of the world through the lens of the camera. In this talk, I will share some of our experience in pursuing 3D world understanding, addressing problems such as 3D reconstruction, 3D detection, depth estimation, and velocity estimation. The solutions to these problems have been applied to applications including biometrics, autonomous driving, and digital humans/faces. In the end, I will also briefly overview other research efforts in the Computer Vision Lab at Michigan State University, such as AIGC for vision tasks, anti-deepfake, and anti-spoofing.
Bio: Dr. Xiaoming Liu is the MSU Foundation Professor and the Anil and Nandita Jain Endowed Professor at the Department of Computer Science and Engineering of Michigan State University (MSU). He received his Ph.D. degree from Carnegie Mellon University in 2004. He works on computer vision, machine learning, and biometrics, especially on face-related analysis and 3D vision. Since 2012, he has helped develop a strong computer vision area at MSU, which is ranked in the top 15 in the US according to the 5-year statistics at csrankings.org. He has been an Area Chair for numerous conferences, the Co-Program Chair of the BTAS’18, WACV’18, IJCB’22, and AVSS’22 conferences, and the Co-General Chair of the FG’23 conference. He is an Associate Editor of IEEE Transactions on Pattern Analysis and Machine Intelligence. He has authored more than 200 scientific publications and has filed 29 U.S. patents. His work has been cited over 20,000 times according to Google Scholar, with an h-index of 73. He is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) and the International Association for Pattern Recognition (IAPR).
Nov 22
2023
11/24 (Fri.)_Open Source and the Development of Artificial Intelligence: Observations from a Legal Perspective_Speaker: Isabel Hou, Attorney (Secretary General, Taiwan AI Academy)
webman
Topic: Open Source and the Development of Artificial Intelligence: Observations from a Legal Perspective
Time: November 24, 2023 (Fri.), 10:10 AM to 12:00 PM
Place: Delta Building, Room 108
Bio:
Isabel Hou is the Secretary General of the Taiwan AI Academy Foundation. She is a seasoned attorney focusing on technological innovation and intellectual property law, and has served as legal counsel for various government programs, prestigious companies, and NGOs in Taiwan since 2000. After leaving Lee and Li, Attorneys-at-Law, Isabel became a solo practitioner and has since regularly collaborated with fellow lawyers and professionals from various backgrounds on a project basis.
She currently leads the AI Civic Forum project. Isabel served as a committee member of Taiwan’s Open Parliament Multi-stakeholder Forum from 2019 to 2022.
Nov 08
2023
11/10 (Fri.)_Toward Foundation AI Models in Smart Manufacturing_Speaker: Wei-Chao Chen, Chief Digital Officer and Senior Vice President (Inventec Corporation)
webman
Topic: Toward Foundation AI Models in Smart Manufacturing
Time: November 10, 2023 (Fri.), 10:10 AM to 12:00 PM
Place: Delta Building, Room 108
Abstract
The primary challenges of using AI in smart manufacturing include scope change, verification difficulty, and transfer quality. Ill-defined requirements often result in shifts in data collection scopes and concepts. The rarity of real manufacturing failures poses problems for verifying model quality. The need to scale out model deployment also implies stringent conditions for domain transfer. This talk discusses our recent progress and observations in these respective areas. In particular, we focus on technologies for the trustworthy exchange of datasets, which lays the foundation for models that are widely applicable to various application scenarios in robotics, contactless sensing, and visual inspection.
Short Bio:
Wei-Chao Chen is the Chief Digital Officer and Senior Vice President at Inventec Corp., a tier-one electronics company, and the Chairman of Skywatch Innovation, a developer of cloud-based IoT and video products. Dr. Chen is also a Visiting Professor at National Taiwan University. His research interests include graphics hardware, computational photography, augmented reality, and computer vision. Dr. Chen was the Chief AI Advisor at Inventec from 2018 to 2020, an adjunct faculty member at National Taiwan University from 2009 to 2018, a senior research scientist at the Nokia Research Center in Palo Alto from 2007 to 2009, and a 3D Graphics Architect at NVIDIA from 2002 to 2006. Dr. Chen received his M.S. in Electrical Engineering from National Taiwan University (1996), and his M.S. (2001) and Ph.D. (2002) in Computer Science from the University of North Carolina at Chapel Hill.
Oct 31
2023
11/3 (Fri.)_Generating Moral Machines_Speaker: Prof. Shao-Man Lee (Miin Wu School of Computing, National Cheng Kung University)
webman
Topic: Generating Moral Machines
Time: November 3, 2023 (Fri.), 10:10 AM to 12:00 PM
Place: Delta Building, Room 108
Abstract
This talk investigates the capabilities of large language models, specifically GPT-3.5, in accurately representing human moral judgments across diverse cultures. Using the Moral Machine experiment as a testbed, her research examines GPT-3.5’s decision-making under varying cultural contexts.
While the results indicate that GPT-3.5 exhibits some ability to approximate human moral inclinations, significant discrepancies remain compared to the experimental data, especially regarding nuanced cultural preferences. Her research highlights the challenges for AI in precisely replicating complex and variable human moral expression. It underscores the need to incorporate heterogeneous values into model training to better portray inclusive, multi-faceted global decision-making.
Short Bio:
Shao-Man Lee is an assistant professor who applies computational techniques to study socio-legal issues. With a background in law, she utilizes natural language processing methods to elucidate topics ranging from judicial behavior to risk communication during the COVID-19 pandemic.
She is also involved in creating open datasets and models, such as Traditional Chinese legal named-entity resources, to advance legal computation. Through an interdisciplinary approach that synthesizes law, social sciences, and AI, her research aims to provide data-driven insights into legal and social phenomena.
Oct 20
2023
10/31 (Tue.)_Music-conditioned pluralistic dancing and a multi-camera system_Speaker: Prof. Sanghoon Lee (Yonsei University, Korea)
webman
Topic: Music-conditioned pluralistic dancing and a multi-camera system
Time: October 31, 2023 (Tue.), 3:30 PM to 5:30 PM
Place: Delta Building, Room 108
Abstract
In this talk, I would like to present “music-conditioned pluralistic dancing” and what we have done in our lab toward building a multi-camera system for future research. When coming up with phrases of movement, choreographers all have their habits, as they are used to their skilled dance genres. Therefore, they tend to return to certain patterns of the dance genres they are familiar with. What if artificial intelligence could be used to help choreographers blend dance genres by suggesting various dances, including ones that match their choreographic style? Numerous task-specific variants of autoregressive networks have been developed for dance generation. Yet, a serious limitation remains: all existing algorithms can return repeated patterns for a given initial pose sequence, which may be inferior. To mitigate this issue, we proposed MNET, a novel and scalable approach that can perform music-conditioned pluralistic dance generation across multiple dance genres using only a single model. Here, we learned a dance-genre-aware latent representation by training a conditional generative adversarial network that leverages a Transformer architecture. After demonstrating the dance generation, I would like to introduce our lab's efforts toward implementing a multi-camera system. From this camera system, we expect numerous possibilities for developing core technologies in research areas that fuse computer vision and computer graphics.
Short Bio:
Sanghoon Lee is a Professor in the EE Department at Yonsei University, Korea. His current research interests include image processing, computer vision, and graphics. He was an Associate Editor of the IEEE Transactions on Image Processing from 2010 to 2014 and served as a Guest Editor for the IEEE Transactions on Image Processing in 2013. He was the General Chair of the 2013 IEEE IVMSP Workshop and has been serving as the Chair of the IEEE P3333.1 Working Group since 2011. He served as an Associate Editor for the IEEE SPL from 2014 to 2018, and as a Senior Area Editor of the IEEE SPL from 2018 to 2022. He was a member of the IEEE IVMSP TC (2014–2019) and MMSP TC (2016–2021), and the IVM TC Chair of APSIPA from 2018 to 2019. He has been serving as an Associate Editor of the IEEE Transactions on Multimedia and a member of the Senior Editorial Board of the IEEE Signal Processing Magazine since 2022. He is a Board of Governors member of APSIPA and the Editor-in-Chief of the APSIPA Newsletter.
Contact
- Dr. Shang-Hong Lai
- Computer Vision Lab
- Department of Computer Science, National Tsing Hua University
- Rooms 719, 720, 721, Delta Building, No. 101, Section 2, Kuang-Fu Road, Hsinchu, Taiwan 30013, R.O.C.
- TEL: (03)5715131, Room 720: ext. 80932, Room 721: ext. 80933
Directions
- Delta Building (台達館)
After entering the NTHU campus, go straight, turn right after the large lawn, and continue straight; after passing the Education Building, turn left at Engineering Building III to reach the Delta Building.
- GPS coordinates: 24.79591 N, 120.99211 E