XML-DEV Mailing List Archive
RE: Something altogether different?
Thanks Ken. Comments inline.

From: Ken North [mailto:kennorth@s...]

>> So where we do understand how the vector model
>> works for text analysis, do we understand how to apply
>> it to a *text* that includes video and audio as integral
>> parts of the *text* and can we combine these into a
>> higher level space vector term

>1. "Facilitating Video Access by Visualizing Automatic Analysis"
>http://www.fxpal.com/publications/FXPAL-PR-99-045.pdf
>"Metadata for video materials can be derived from the analysis of the audio and
>video streams. For audio, we identify features such as silence, applause, and
>speaker identity. For video, we find features such as shot boundaries,
>presentation slides, and close-ups of human faces."

This comes closest to what I would consider a good start, as it uses gestural significators. It is in combination with other vocabularies that I think there is more bang for the buck. If similarity metrics apply across the vocabularies (a gesture in any vocabulary gets the same vm signature), then the cues all reinforce each other and the interpretation probability goes up.

Of course, the problem of interpretation starts as soon as we assign metadata and begin to reason over it, and it amplifies (self-reinforcing assumptions, or GIGO). The reason for comparison is to detect superstition or, simply, garbage in the metadata induced by faulty observation. Tough problem.

>2. Yahoo has recently taken the RSS approach. Video RSS provides a text
>description such as height, width, bitrate and running time:
>http://www.webservicessummit.com/Channels/WebServicesSummitAudioVideo.rss

Which is ok for knowing something about the coffee cup but not the coffee.

>3. SQL implementations such as DB2 UDB support content-based querying over rich
>types. DB2 has an Image Extender and Audio Extender with corresponding types
>(DB2IMAGE, DB2AUDIO). The Audio Extender analyses the content and stores values
>such as whether it's 16-bit audio, samples per second, playing time, the number
>of clock ticks per quarter note and so on. The Image Extender stores information
>that enables you to provide an image and search for matches based on color and
>texture (contrast, directionality, etc.).
>IBM's CueVideo software uses speech recognition technology to generate text from
>the audio tracks of videos -- which could then be fed into an engine that uses
>the vector space model and textual similarity matching described in my previous
>message:
>http://www.almaden.ibm.com/projects/data/CueVideo.pdf

Yes. A useful source. I suspect that, aided by the human eye, this is good. We have similar products (Video Analyst) that also enhance images.

>4. This paper discusses analysis of digital music using similarity matrices.
>Media Segmentation using Self-Similarity Decomposition
>http://www.fxpal.com/people/cooper/Papers/SPIE02.pdf
>"....In this example, the sequence ABC is a significantly shorter summary
>containing essentially all the information in the song."

As a songwriter and producer, I can refute that. Let's just say that we work our bunnies off to make that wrong, but sometimes don't. The self-similarity of music is a cosmic d'oh.

"To play with only this and that old hat is such a bore, but I sadly fear the love of the ear is to hear what it heard before."

Digital editing makes it easy to grind out a self-similar production and it seduces one into not doing it by eliminating serendipitous opportunities and the fecundity of breath.

>3.1. Clustering via similarity matrix decomposition
>"To cluster the segments, we factor a segment-indexed similarity matrix to find
>repeated or substantially similar groups of segments."

Which means the musician/songwriter succeeded and failed. Much of what I've learned over the years does come from my night gig, and yes, similarity plays a big role. On the other hand, so does 'new and different', and there is the length and tension of the performer's tightrope. We depend on that for repeat business and it depresses us because it isn't just one thing after another, but the same thing after another.

whinge....

len
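
For readers who want to see the similarity-matrix idea from item 4 in concrete form, here is a minimal sketch in Python. It is not the method from the FXPAL paper, which factors the matrix; it swaps in a much simpler greedy threshold grouping over cosine similarities. The feature vectors, the 0.9 threshold, and the function names are all hypothetical, chosen only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)

def similarity_matrix(segments):
    """Segment-indexed matrix: entry [i][j] compares segment i with segment j."""
    return [[cosine(a, b) for b in segments] for a in segments]

def group_similar(matrix, threshold=0.9):
    """Greedy grouping (an assumption, not matrix factorization):
    each segment joins the first group whose first member it resembles."""
    groups = []
    for i in range(len(matrix)):
        for group in groups:
            if matrix[i][group[0]] >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups

# Hypothetical feature vectors for five segments of one song.
segments = [
    [1.0, 0.1, 0.0],  # verse
    [0.1, 1.0, 0.2],  # chorus
    [0.9, 0.2, 0.0],  # verse again
    [0.0, 0.9, 0.3],  # chorus again
    [0.1, 0.1, 1.0],  # bridge
]

matrix = similarity_matrix(segments)
groups = group_similar(matrix)
print(groups)  # [[0, 2], [1, 3], [4]]
```

The three groups correspond to the "ABC" summary in the quoted passage: one representative per group stands in for every repeat. Whatever distinguishes the second verse from the first falls below the threshold and is discarded, which is the songwriter's complaint above.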