Friday, 15 February 2013

xml - Advanced Boolean search of JSON files containing speech-to-text data?


I have automatic machine transcripts of hundreds of video and audio files. I have the transcripts in five formats: JSON, XML, SRT, VTT, TXT. (Click to view example files.) The JSON and XML files contain the most comprehensive data, including speaker ID, confidence level, and timecode.

I'm looking for a way to search this data for words and phrases. I need to be able to submit a Boolean search query, then click on a result and play the video / audio file at the timecode of the text result. The only required Boolean operators are AND, OR, and NOT (or just something like an online search engine). Example search: ("Baseball Bat" AND Park) OR Soccer

I'm thinking of a simple interface.

Basic Options:

  • Search box
  • Minimum confidence level slider

Ideas for advanced options:

  • Speaker: "Bob, Jo, Bill" (i.e., the speaker must be one of these)
  • Maximum time allowed between words in an AND search: XX seconds
  • Maximum time allowed between words in an exact phrase search: XX seconds
  • Words in an exact phrase search must be spoken by the same speaker: On / Off
  • Words in an AND search must be spoken by the same speaker: On / Off
  • Words in an AND search must be found in chronological order: On / Off
  • Ignore punctuation: On / Off
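
To make the AND-search options above concrete, here is a rough Python sketch of how they might be checked against one transcript. It is only an illustration: the record keys `word`, `start`, `speaker`, and `confidence` are assumed names, not the actual layout of the JSON files, and "maximum time between words" is interpreted here as the span between the earliest and latest matched word.

```python
from itertools import product


def normalize(token, ignore_punctuation=True):
    """Lowercase a token and optionally drop punctuation characters."""
    token = token.lower()
    if ignore_punctuation:
        token = "".join(ch for ch in token if ch.isalnum())
    return token


def and_matches(words, terms, max_gap=10.0, same_speaker=False,
                chronological=False, speakers=None, ignore_punctuation=True):
    """Return tuples of word records (one per term) that satisfy the options.

    `words` is a list of dicts with the ASSUMED keys
    "word", "start" (seconds), "speaker", and "confidence".
    """
    if not terms:
        return []
    # Collect the candidate occurrences of every search term.
    candidates = []
    for term in terms:
        wanted = normalize(term, ignore_punctuation)
        hits = [w for w in words
                if normalize(w["word"], ignore_punctuation) == wanted
                and (speakers is None or w["speaker"] in speakers)]
        candidates.append(hits)
    results = []
    # Try every combination of occurrences and keep those within the limits.
    for combo in product(*candidates):
        times = [w["start"] for w in combo]
        if max(times) - min(times) > max_gap:
            continue  # words are too far apart in time
        if same_speaker and len({w["speaker"] for w in combo}) != 1:
            continue  # words come from different speakers
        if chronological and times != sorted(times):
            continue  # words are not spoken in query order
        results.append(combo)
    return results
```

Each returned tuple carries the `start` values needed to jump to the right place in the media file. The brute-force combination loop is fine for a single transcript but would need an index to scale to hundreds of files; exact phrase search is not covered here.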

Basically, I need something like Agent Ransack with timecode and, if possible, some of the miscellaneous options. I know this is a very specific and complicated request. :) Can you give me any leads on this? I do not want to reinvent the wheel. Which software / command-line program / engine comes closest to being able to do all this? Maybe I can customize it from there.

Thank you!

You can build such a system on top of Solr / Lucene; however, implementing the necessary features will require more programming experience.

You can see

for an open-source implementation of speech archive indexing.

You can also look at the Matterhorn speech indexing. You can find details

However, this is not the only way to implement such functionality. The choice of language is wide, and you could get even further with simple tools: Ruby, PHP, or Node.js will also work here.
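
As an illustration of the simple-tools route, here is a minimal sketch in Python (chosen for the example; the same idea carries over to Ruby, PHP, or Node.js). It parses a query with AND / OR / NOT, quoted phrases, and parentheses, then evaluates it against one transcript and returns the matching timecodes. The record keys `word`, `start`, and `confidence` are assumptions about the JSON layout, and all function names are made up for this example.

```python
import re

TOKEN_RE = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def tokenize(query):
    """Split a query into parentheses, quoted phrases, and bare words."""
    return TOKEN_RE.findall(query)


def parse(tokens):
    """Recursive-descent parser; precedence is NOT > AND > OR."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        pos += 1
        return tokens[pos - 1]

    def parse_or():
        node = parse_and()
        while peek() is not None and peek().upper() == "OR":
            take()
            node = ("OR", node, parse_and())
        return node

    def parse_and():
        node = parse_not()
        while peek() is not None and peek().upper() == "AND":
            take()
            node = ("AND", node, parse_not())
        return node

    def parse_not():
        if peek() is not None and peek().upper() == "NOT":
            take()
            return ("NOT", parse_not())
        tok = take()
        if tok == "(":
            node = parse_or()
            if take() != ")":
                raise ValueError("expected closing parenthesis")
            return node
        # A term is a list of lowercase words (more than one for a phrase).
        return ("TERM", tok.strip('"').lower().split())

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return tree


def find_term(words, phrase, min_conf):
    """Start times where the phrase occurs as consecutive words."""
    if not phrase:
        return []
    texts = [w["word"].lower().strip(".,!?;:") for w in words]
    n = len(phrase)
    return [words[i]["start"] for i in range(len(words) - n + 1)
            if texts[i:i + n] == phrase
            and all(w["confidence"] >= min_conf for w in words[i:i + n])]


def evaluate(node, words, min_conf=0.0):
    """Return (matched, timecodes) for one transcript."""
    kind = node[0]
    if kind == "TERM":
        hits = find_term(words, node[1], min_conf)
        return bool(hits), hits
    if kind == "NOT":
        matched, _ = evaluate(node[1], words, min_conf)
        return not matched, []  # a negation has no timecode of its own
    left_ok, left_hits = evaluate(node[1], words, min_conf)
    right_ok, right_hits = evaluate(node[2], words, min_conf)
    ok = (left_ok and right_ok) if kind == "AND" else (left_ok or right_ok)
    return ok, (sorted(left_hits + right_hits) if ok else [])
```

Calling `evaluate(parse(tokenize('("Baseball Bat" and Park) or Soccer')), words, min_conf=0.5)` on the word list of one file gives a match flag plus the start times to seek to in the player. Scanning every file per query is the simplest possible design; for hundreds of long transcripts an inverted index, or Solr / Lucene as suggested above, would likely be faster.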

