Videotext Recognition For Sports Video Summarization
Automatically detect and extract caption text regions in video at real-time speed
Automatically detect the video frames that contain caption text (the text key frames) at real-time speed
Develop recognition modules that recognize words and characters more accurately
Develop domain-specific video text detection and recognition techniques for sports video indexing, retrieval, and summarization |
Research challenges
Compared with document text extraction and recognition, text detection and
recognition in video faces the following challenges:
Varied locations and layouts, which make text detection difficult
Small text size and low image resolution, which make both detection and recognition difficult
Variations in font, lighting, and similar factors, which make accurate text recognition very difficult
Blurred or transparent characters, which make accurate text recognition very difficult
State of the Art
Many researchers have studied the problem of text detection and recognition
in video:
Researchers at Michigan State University use color quantization and connected component analysis to locate text in video frames and images.
Researchers at CMU studied superimposed caption detection and recognition in news video. They use an edge-based approach to detect text positions, a vertical projection profile with thresholding to segment text lines, and template matching to recognize characters. They report approximately 84% recognition precision on CNN news video.
Researchers at Intel propose an approach for text detection in digital video, including movies, commercials, and news. They combine texture, color, contrast, and motion features to localize and extract text patches, which are then filtered by a rule-based approach. They use commercial recognition tools to recognize the text. Recognition accuracy ranges from 41% to 76%.
Researchers at the University of Maryland studied text detection and recognition in digital video, including commercials, sports, news, movies, and TV programs. They use Haar wavelets, geometrical moments, and a neural network to locate text areas, and commercial recognition tools to recognize characters. Their experiments showed that the recognition accuracy of the commercial tool is only 53% without character enhancement and about 88% after resolution enhancement.
Other researchers at IBM Research, Microsoft Research, Philips Research, and elsewhere have also investigated this problem.
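
One technique mentioned above, segmenting text with a projection profile and a threshold, can be illustrated with a minimal sketch. This is not the CMU implementation. The function names, the toy binary map, and the threshold of 0 are assumptions made for illustration, and naming conventions for "vertical" versus "horizontal" profiles vary between authors. Here, summing each row of a binary edge map separates text lines, and summing each column within a line separates character candidates.

```python
def projection_profile(binary, axis="rows"):
    """Count foreground pixels along each row (or each column) of a binary map."""
    if axis == "rows":
        return [sum(row) for row in binary]
    return [sum(col) for col in zip(*binary)]


def segment_runs(profile, threshold):
    """Return (start, end) index pairs, end exclusive, where profile > threshold."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i                      # a run of text begins
        elif value <= threshold and start is not None:
            runs.append((start, i))        # the run ends at a gap
            start = None
    if start is not None:
        runs.append((start, len(profile)))  # run reaches the last index
    return runs


# Hypothetical 8x10 binary map: 1 marks an edge/text pixel, 0 marks background.
frame = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

# Split the map into text lines using the per-row profile.
row_profile = projection_profile(frame, axis="rows")
text_lines = segment_runs(row_profile, threshold=0)

# Split the first text line into character candidates using the per-column profile.
top, bottom = text_lines[0]
col_profile = projection_profile(frame[top:bottom], axis="cols")
characters = segment_runs(col_profile, threshold=0)

print(text_lines)   # [(1, 3), (5, 7)]
print(characters)   # [(1, 3), (4, 7), (8, 9)]
```

On real video frames the threshold would likely need to be tuned above zero to tolerate background edges and noise, which is one reason the small text size and low resolution noted under the research challenges make segmentation difficult.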