Computer Vision|Introduction
What is computer vision
- Make a computer understand images and “tell a story” about them, as a human being would.
- To bridge the gap between pixels and “meaning”.
- Vision as a source of semantic information.
Principles of the human visual system
- Hierarchical -> multi-scale fusion
- Center-biased -> regularization
- Saliency -> saliency detection
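As a rough illustration of the "hierarchical -> multi-scale" idea, the sketch below builds an image pyramid by repeatedly averaging 2x2 pixel blocks. It covers only the multi-scale representation, not the fusion step. The function names (`downsample_2x`, `build_pyramid`) and the 4x4 sample image are assumptions made for this example, not something taken from a specific library.

```python
def downsample_2x(image):
    """Halve width and height by averaging each 2x2 block of pixels."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, levels):
    """Return [full resolution, 1/2, 1/4, ...] with at most `levels` entries."""
    pyramid = [image]
    # Stop early once the coarsest level can no longer be halved.
    while (len(pyramid) < levels
           and len(pyramid[-1]) >= 2
           and len(pyramid[-1][0]) >= 2):
        pyramid.append(downsample_2x(pyramid[-1]))
    return pyramid


# Hypothetical 4x4 grayscale image (values chosen only for illustration).
image = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 2, 2],
    [4, 4, 2, 2],
]
pyramid = build_pyramid(image, levels=3)
for level in pyramid:
    print(len(level), "x", len(level[0]))
```

Coarse levels summarize large structures, while fine levels keep detail. A multi-scale method would typically extract features at every level and then combine them.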
Covers
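- Computing 3D structure (shape and motion capture)
- Recognition (object detection, semantic segmentation, image captioning, action recognition)
- Image enhancement (background blurring, super-resolution reconstruction, denoising, shadow removal, deblurring)
- Image editing (style transfer, image generation, image restoration, image completion)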
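To make one of the enhancement tasks above concrete, here is a minimal sketch of denoising with a 3x3 median filter, a classic way to suppress isolated outlier pixels. The function name `median_filter_3x3`, the border handling, and the sample patch are assumptions for illustration only. Practical systems would normally rely on an image-processing library or a learned model.

```python
from statistics import median


def median_filter_3x3(image):
    """Replace each pixel with the median of its 3x3 neighborhood.

    Border pixels use only the neighbors that fall inside the image.
    """
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            neighbors = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
            ]
            output[r][c] = median(neighbors)
    return output


# Hypothetical flat gray patch with a single noisy pixel in the middle.
noisy = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]
denoised = median_filter_3x3(noisy)
```

In this patch the outlier value 100 is replaced by the surrounding value 10, because a single extreme pixel never becomes the median of its neighborhood.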
Applications
- OCR (Optical Character Recognition)
- Face detection and analysis (smile detection…)
- Fingerprint / face unlock
- recreation (for example, a camera effect in which lipstick comes out when you open your mouth)
- Google Maps: annotate all houses and streets
- Amazon Go (supermarket)
- tracking
- autonomous vehicles
- robotics
- medical diagnosis
- vision-based interaction and games (fitness bands)
- Augmented Reality (AR)
- Virtual Reality
Challenges
- viewpoint variation
- weak lighting
- scale variation
- intra-class variation
- motion
- cluttered background
- occlusion
- blur
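One common way to probe how a vision model copes with such challenges is to perturb test images synthetically. The sketch below simulates two of them, weak lighting and occlusion, on a grayscale image stored as a list of rows. The helper names `darken` and `occlude` and their parameters are hypothetical and serve only this example.

```python
def darken(image, gain):
    """Simulate weak lighting by scaling every intensity by `gain` (0 to 1)."""
    return [[pixel * gain for pixel in row] for row in image]


def occlude(image, top, left, height, width, fill=0):
    """Simulate occlusion by overwriting a rectangle with a constant value."""
    result = [row[:] for row in image]  # copy so the input stays unchanged
    for r in range(top, min(top + height, len(result))):
        for c in range(left, min(left + width, len(result[0]))):
            result[r][c] = fill
    return result


# Hypothetical uniform 4x4 image used only to show the effect.
image = [[100, 100, 100, 100] for _ in range(4)]
dim = darken(image, gain=0.25)
blocked = occlude(image, top=1, left=1, height=2, width=2)
```

Comparing a model's predictions on `image`, `dim`, and `blocked` gives a first impression of its sensitivity to lighting and occlusion.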
All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stated otherwise.