Language Detection from Speech: Chinese or English?

In language processing, it is an essential step to detect which language it is before speech recognition and machine translation. This blog post presents an approach to distinguish Chinese and English from speech (an audio sample) using a neural network model. Spark is used to perform data preprocessing, and TensorFlow ...

read more

Latent Dirichlet Allocation and Topic Modeling

When reading an article, we humans are able to easily identify the topics the article talks about. An interesting question is: can we automate this process, i.e., train a machine to find out the underlying topics in articles? In this post, a very popular topic modeling method, Latent Dirichlet ...

read more

Hidden Markov Model and Part of Speech Tagging

In a Markov model, we generally assume that the states are directly observable or one state corresponds to one observation/event only. However, this is not always true. A good example would be: in speech recognition, we are supposed to identify a sequence of words given a sequence of utterances ...

read more

Locating and Filling Missing Words in Sentences

Sat 05 Mar 2016 by Tianlong Song Tags Natural Language Processing

There has been many occasions that we have incomplete sentences that are needed to completed. One example is that in speech recognition noisy environment can lead to unrecognizable words, but we still hope to recover and understand the complete sentence (e.g., by inference); another example is sentence completion questions ...

read more