Sunday 24 January 2016

Using a Bayesian Network to learn word sequence patterns

We want to see how the ideas we talked about before (in the following posts) can help us learn patterns. The example is about learning the way we use words in simple sentences: the kind of sentences we usually type in IM or chat, or the way babies start to build sentences. Here are the posts you may want to read first if you are not familiar with Bayesian Networks:

Using probability to predict state of a system
Introduction to Transition matrix and Superposition of states
More on Transition matrix and Superposition of states

How your cell phone learns your style of writing
You get a new cell phone and start using its messaging system. You enter sentences when you text your friends, and eventually the phone starts suggesting words or guessing what you are going to type. The more you use the phone, the better its guesses get.



Let's see how we can build a simple Bayesian Network, based on the sentences you've already texted, to guess the next word you are typing. Consider the following simple sentences, and suppose you've entered them before; now we want to see how the phone can build a Bayesian Network to predict your next word.

- hey good morning
- how are you
- good morning
- i am good
- how about you
- i am busy
- i am good too

We chose just this small number of short sentences because we want a simple, understandable graph for our network. All we need to do is make every word a node of the graph and connect the words in the order they appear in the sentences. Score a connection as one when you see it for the first time, or increase its score if it is already there (a code sketch of this counting step follows the trace below).

So if we parse the given sentences from the top, we get the following changes in the scores of the edges of the network; note that we've used "-" as a token marking the start of a new sentence (so the end of one sentence also connects back to "-"):

-  =(1)=>  hey
hey  =(1)=>  good
good  =(1)=>  morning
morning  =(1)=>  -
-  =(1)=>  how
how  =(1)=>  are
are  =(1)=>  you
you  =(1)=>  -
-  =(1)=>  good
good  =(2)=>  morning
morning  =(2)=>  -
-  =(1)=>  i
...
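Here is a minimal Python sketch of this counting step (the function name build_transition_scores and the in-memory dict structure are just illustrative choices, not necessarily what a phone actually does):

from collections import defaultdict

START = "-"  # token marking the boundary between sentences

sentences = [
    "hey good morning",
    "how are you",
    "good morning",
    "i am good",
    "how about you",
    "i am busy",
    "i am good too",
]

def build_transition_scores(sentences):
    # Count how often each word is directly followed by each other word.
    scores = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = [START] + sentence.split() + [START]
        for current, nxt in zip(words, words[1:]):
            scores[current][nxt] += 1  # first sighting -> 1, afterwards -> +1
    return scores

scores = build_transition_scores(sentences)
print(dict(scores[START]))   # {'hey': 1, 'how': 2, 'good': 1, 'i': 3}
print(dict(scores["good"]))  # {'morning': 2, '-': 1, 'too': 1}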

Now if you show all of these nodes and edges with their scores on a network graph, you will have the following.


A network for the given sentences with transition scores

What the network above basically says is that to start a sentence, you go to the node "-". The history shows you've begun your sentences three times with "i", twice with "how", once with "good", and once with "hey". Since "i" has the highest score, you are most likely to start your sentence with "i"; if not, then "how", and so on.

Now if you start your sentence with "i", you are surely going to say "am" next, because the network contains no other path from "i". After saying "am", you have two options, and as the graph shows you are more likely to say "good" than "busy" .... (you can see the live graph here)
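As a rough sketch of how the suggestion step could work on top of the scores we just counted (suggest_next is an illustrative name; it simply ranks the recorded next words by score):

def suggest_next(scores, current_word):
    # Rank the candidate next words, most frequent first.
    candidates = scores.get(current_word, {})
    return sorted(candidates, key=candidates.get, reverse=True)

print(suggest_next(scores, START))  # ['i', 'how', 'hey', 'good']
print(suggest_next(scores, "am"))   # ['good', 'busy']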

If you calculate the probabilities of the next steps and put them on the edges of the graph instead of the scores, you get the following:

A network for the given sentences with transition probabilities
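Computing these probabilities is just normalizing each node's outgoing scores so they sum to one; here is a small sketch continuing the code above (to_probabilities is an illustrative name):

def to_probabilities(scores):
    # Divide each outgoing score by the node's total outgoing score.
    probs = {}
    for word, nexts in scores.items():
        total = sum(nexts.values())
        probs[word] = {nxt: count / total for nxt, count in nexts.items()}
    return probs

probs = to_probabilities(scores)
print(probs["am"])  # {'good': 0.666..., 'busy': 0.333...} -- "am" was followed by "good" 2 times out of 3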

This model is in fact like the way a human baby starts learning to speak; he/she doesn't know any grammar rules, but soon learns by listening to how other people use words.

The following network shows the graph built from part of the text of this post. As you can see, the density of edges is higher for some nodes; in other words, nodes like "the", "and", and "we" have more in/out connections than the others, because these are the words I've used most in this document. More generally, if you drew a network of all my posts, you could easily predict my next word from the current word, since my vocabulary is limited.

A network for part of the sentences we have in this post.


One important thing to mention: the networks in this post were all based on just the previous word, while we can get better predictions by building a network that gives the next word based on the two latest words. We will see later how to do this too.
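Just to sketch the idea before that later post: the only change is to key the score table on the pair of the two latest words instead of a single word. Assuming the same sentences and START token as above (build_pair_scores is again an illustrative name):

def build_pair_scores(sentences):
    # Count next words given the pair of the two latest words.
    scores = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = [START, START] + sentence.split() + [START]
        for w1, w2, nxt in zip(words, words[1:], words[2:]):
            scores[(w1, w2)][nxt] += 1
    return scores

pair_scores = build_pair_scores(sentences)
print(dict(pair_scores[("i", "am")]))  # {'good': 2, 'busy': 1}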
