Vision, Speech and Signal Processing

description

A review of the research, facilities and expertise of the Centre for Vision, Speech and Signal Processing. Presented at "Implementing Future Networks, Content and Services with Secure and Efficient Systems" at the University of Surrey, 20 September 2010.

Transcript of Vision, Speech and Signal Processing

Page 1: Vision,  Speech and Signal Processing

Centre for Vision, Speech, and Signal Processing 1

Centre for Vision, Speech and Signal Processing

Josef Kittler

Implementing Future Networks, Content and Services with Secure and Efficient Systems

Page 2: Vision,  Speech and Signal Processing

Centre for Vision, Speech, and Signal Processing

Focus: Processing, interpretation and understanding of multidimensional signals (speech, audio, images, volumetric data, video)

Themes: Biometrics, Visual Media, Video Archive Retrieval and Restoration, Security and Surveillance, Audio Perception, Robotics & Cognitive Vision, Medical Imaging, Multimedia Communications

Recent highlights: top ranking in the 2008 UK RAE (Research Assessment Exercise)

Centre statistics: 110 people (60 PhD students, 30 RAs)

Centre for Vision, Speech and Signal Processing: Prof J Kittler FREng, Director

Academic staff: Prof A Hilton, Prof J Illingworth, Prof A Kondoz, Dr R Bowden, Dr K Mikolajczyk, Dr K Wells, Dr P Jackson, Dr T Windeatt, Dr W Wang, Dr J Collomosse, Dr J Calic, Dr A Fernando, Dr S Worrall

Page 3: Vision,  Speech and Signal Processing

Research Support

  JIF/SRIF grants to provide a high-definition multi-camera broadcast studio and 3D display (£4.5m investment)

  Total grant portfolio of £8.5 million

  Total EPSRC grant portfolio of £4.5 million

  EPSRC Platform Grant 2003-2013

  EPSRC Basic Technology Grant in Medical Imaging

  EU projects from the FP7 Cognitive Systems Programme (success rate 15%)

  Industrial funding (10%)

  Turnover (£4.5m)

Page 4: Vision,  Speech and Signal Processing

Visual Content Production (Professor Adrian Hilton)

Reconstruction of real scenes from images

  3D Shape and Motion Capture

  Video-based animation of people

  3D video of faces

  3D Studio Production

  Free-viewpoint video of sports

Applications: Film, Broadcast, Games, Communication

Page 5: Vision,  Speech and Signal Processing

Imagine being able to choose your viewpoint with no restrictions.

Virtual camera

Page 6: Vision,  Speech and Signal Processing

Visually realistic talking face synthesis

Page 7: Vision,  Speech and Signal Processing

Audio Perception

  Retrieval

  Quality assessment

  Speech modelling

  3D sound modelling

  Blind source separation

Institute of Sound Recording

Dr Jackson

Dr Wang

Page 8: Vision,  Speech and Signal Processing

Restoration

Figure panels: Original; Flicker compensated

Page 9: Vision,  Speech and Signal Processing

Video annotation and indexing

The system can work out the score directly from the video, as well as the ball track, player actions…

Page 10: Vision,  Speech and Signal Processing

Image/Video database retrieval

Example category labels: aircraft, car, mountain, snow, cat, chair, person, tv-monitor, airplane, animal, desert, road

Dr Mikolajczyk

Dr Calic

Page 11: Vision,  Speech and Signal Processing

Artistic rendering

  Artistic content creation from images and videos

Dr Collomosse

Page 12: Vision,  Speech and Signal Processing

Security and Surveillance

  Detection of people in natural scenes

  Object tracking

  Tracking over multiple cameras with non-overlapping views

  Human motion behaviour analysis and recognition

Page 13: Vision,  Speech and Signal Processing

User-centric communication (I-Lab)

  Joint work in real-time, multi-platform remote collaboration

  Simultaneous editing and sharing of desktop documents, audio-visual communication

Prof Kondoz

Page 14: Vision,  Speech and Signal Processing

Visual Interfaces and Interaction

  Feature tracking by on-the-fly learning

  Sign language recognition

  Expressive visual interfaces exhibited in museums (UK and USA)

Applications: HCI, Communication, Entertainment

Dr Bowden

Page 15: Vision,  Speech and Signal Processing

Secure access via biometrics

VOICE

FACE

LIPS

Fusion

3D ASSISTED 2D FACE RECOGNITION

3D FACE RECOGNITION

Page 16: Vision,  Speech and Signal Processing

Technology transfer

Hand-held 3D Scanner - Product with 3D Scanners UK

EU IT Grand Prize 1996

Manufacturing Industry 1997 ‘Design product of the year’

Computer Graphics World ‘Innovation award’ 1996

EU IT Prize 2001

3D Photo Booth Avatar-Me

Video editing system ‘mokey’ (Imagineer Systems)

Spin-out company

Page 17: Vision,  Speech and Signal Processing

CVSSP Facilities

  High-definition Multi-camera Broadcast Studio: 100 m² chroma-key, 8 HD + 14 SD cameras

  Marker-based Human Motion Capture: 4 × CODA Active Marker Motion Capture

  3D Shape Capture: 3D Scanners ModelMaker hand-held scanner

  4D Dynamic Face Capture: video-rate capture of face shape and colour

  Video processing engine: SDI digital video-rate processing (40 CPUs)

  Picture quality assessment facilities

Page 18: Vision,  Speech and Signal Processing

Visualisation Facilities

  Active stereo, rear-projected display (7.5 × 2.5 m)

  340-speaker WFS (wave field synthesis) 3D audio

Page 19: Vision,  Speech and Signal Processing

The Esmono soundproof booth

Page 20: Vision,  Speech and Signal Processing

Conclusions

  CVSSP research is driven by:

    Scientific curiosity and challenge

    Anticipated demands

    Specific applications

  CVSSP has state-of-the-art expertise in multimedia content production, indexing, retrieval, restoration, communication and secure access

  Excellent facilities with open access

  Good track record in technology transfer
