Monday, 15 July 2013

matlab - How to remove all cells which contain supersets of other cells?


I am working in text mining. I have 23 sentences that I extracted from a text file, along with 6 frequent words extracted from the same text file.

For the frequent words, I created a 1D array that shows each word and the sentences in which it occurs. After that, I took the intersection to show in which sentences each word occurs together with each of the remaining words:

  occursTogether = cell(length(out1));
  for ii = 1:length(out1)
      for jj = ii+1:length(out1)
          % sentences in which word ii and word jj both occur
          occursTogether{ii,jj} = intersect(out1{ii}, out1{jj});
      end
  end
  celldisp(occursTogether)

The output looks something like this:

  occursTogether[1,1] = 4 3
  occursTogether[1,2] = 1 4 3
  occursTogether[1,3] = 4 3

Above, [1,1] shows that word number 1 occurs in sentences 4 and 3, [1,2] shows that words 1 and 2 occur together in sentences 1, 4 and 3, and so on.

What I want to do is implement an absorption technique that removes all the cells which are supersets of other cells. As we can see, [1,1] = 4 3 is a subset of [1,2], so the occursTogether[1,2] entry should be removed and the output should be as follows:

  occursTogether[1,1] = 4 3
  occursTogether[1,3] = 4 3

Remember that all possible subsets of entries in the system should be checked.

I think this does what you want:

  [ii, jj] = ndgrid(1:numel(occursTogether));
  s = cellfun(@(x,y) all(ismember(x,y)), occursTogether(ii), occursTogether(jj));
  s = triu(s,1); % count each pair only once, and remove self-pairs
  result = occursTogether(~any(s,1));

Example 1:

  occursTogether{1,1} = [4 3]
  occursTogether{1,2} = [1 4 3]
  occursTogether{1,3} = [1 4 3 5];
  occursTogether{1,4} = [1 4 3 5];

returns

  >> celldisp(result)
  result{1} =
       4     3

occursTogether{1,2} is removed because it is a superset of occursTogether{1,1}. occursTogether{1,3} is removed because it is a superset of occursTogether{1,2}. occursTogether{1,4} is removed because it is a superset of occursTogether{1,3}.
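To see why only the first cell survives in Example 1, it may help to look at the intermediate subset matrix. The following is a minimal sketch that reruns the answer's four lines on the Example 1 data; the variable `keep` is introduced here only for illustration and is not part of the original answer.

```matlab
% Example 1 data, written as a 1x4 cell array
occursTogether = {[4 3], [1 4 3], [1 4 3 5], [1 4 3 5]};

% All ordered pairs (i, j) of cell indices
[ii, jj] = ndgrid(1:numel(occursTogether));

% s(i,j) is true when every element of cell i also appears in cell j
s = cellfun(@(x, y) all(ismember(x, y)), occursTogether(ii), occursTogether(jj));

% Keep only pairs with i < j, so each cell is compared with earlier cells only
s = triu(s, 1);
disp(s)

% A true entry anywhere in column j means cell j is a superset of an earlier cell
keep = ~any(s, 1);   % illustrative helper name, not in the original answer
result = occursTogether(keep);
celldisp(result)
```

After `triu`, columns 2, 3 and 4 each contain at least one true entry, so those cells are dropped and only column 1 (the cell `[4 3]`) is kept.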

Example 2:

  occursTogether{1,1} = [10 20 30]
  occursTogether{1,2} = [10 20 30]

returns

  >> celldisp(result)
  result{1} =
      10    20    30

occursTogether{1,2} is removed because it is a superset of occursTogether{1,1}, but occursTogether{1,1} is not removed even though it is a superset of occursTogether{1,2}. This is thanks to comparing each set only with the previous ones (the third line of code).
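The effect of that third line can be checked directly on the Example 2 data. The sketch below compares three variants of the subset matrix; the names `sFull`, `sOffDiag`, `sUpper` and the `keep*` vectors are made up for this illustration and do not appear in the original answer.

```matlab
% Example 2 data: two identical sets
occursTogether = {[10 20 30], [10 20 30]};

[ii, jj] = ndgrid(1:numel(occursTogether));
sFull = cellfun(@(x, y) all(ismember(x, y)), occursTogether(ii), occursTogether(jj));

% Variant 1: no restriction. Every set is a subset of itself,
% so every column is flagged and nothing would be kept.
keepFull = ~any(sFull, 1);

% Variant 2: only self-pairs excluded. The two identical sets
% flag each other, so both would still be removed.
sOffDiag = sFull & ~eye(numel(occursTogether));
keepOffDiag = ~any(sOffDiag, 1);

% Variant 3: strict upper triangle, as in the answer. Each set is
% compared only with earlier ones, so the first copy survives.
sUpper = triu(sFull, 1);
keepUpper = ~any(sUpper, 1);

disp([keepFull; keepOffDiag; keepUpper])
```

Only the `triu(s,1)` variant keeps exactly one of the duplicates, which matches the output shown for Example 2.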

