CS601 - Independent Study
  Instructor: Prof. S. Muthu Muthukrishnan

Topic: Data Mining on Streams

With the rapid development of Internet technology, a large amount of data is transmitted online at very high speed. Traditional data mining algorithms can no longer satisfy our new efficiency and accuracy requirements. Because data is generated much faster than traditional algorithms can mine it, we do not have enough time to process all of it, and algorithms with unbounded memory requirements no longer fit in the limited memory available. These constraints motivate us to find more appropriate algorithms for mining on streams that use constant memory and constant time per data item, and that see each data item at most once. Algorithm efficiency is therefore very important here. The incoming data may represent a very complex model. If we use only samples, we may obtain only a very simple model, which cannot accurately represent the current underlying model.
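
As a small illustration of the one-pass, constant-memory constraint described above, the following sketch maintains the mean and variance of a stream using Welford's method. It is only an example of the style of computation, not one of the algorithms studied in this course; the names RunningStats and summarize are hypothetical.

```python
class RunningStats:
    """One-pass mean and (population) variance in O(1) memory (Welford's method)."""

    def __init__(self):
        self.n = 0        # number of items seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        # Each item is seen exactly once and then discarded.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        return self.m2 / self.n if self.n else 0.0


def summarize(stream):
    """Consume any iterable once and return its running statistics."""
    stats = RunningStats()
    for x in stream:
        stats.update(x)
    return stats
```

Each update costs constant time and the state is three numbers, regardless of how long the stream runs.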

For time-changing data, the underlying concept may drift over time. Our streaming model must therefore detect such changes and refine itself as they occur.
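
One simple way to picture drift detection is to compare a model's recent error rate with its long-run error rate. The sketch below is a deliberately naive heuristic under assumed parameters (window and threshold are arbitrary choices), not a specific published detector; the class name WindowDriftDetector is hypothetical, and the sketch only flags drift without refining any model.

```python
from collections import deque


class WindowDriftDetector:
    """Flags drift when the recent error rate exceeds the overall rate by a margin."""

    def __init__(self, window=50, threshold=0.2):
        self.recent = deque(maxlen=window)  # fixed-size window: constant memory
        self.total_errors = 0
        self.total_seen = 0
        self.threshold = threshold          # assumed margin, chosen arbitrarily

    def add(self, error):
        """error is 1 if the model mispredicted the item, else 0.

        Returns True when drift is flagged for this item.
        """
        self.recent.append(error)
        self.total_errors += error
        self.total_seen += 1
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough recent history yet
        recent_rate = sum(self.recent) / len(self.recent)
        overall_rate = self.total_errors / self.total_seen
        return recent_rate - overall_rate > self.threshold
```

When add returns True, a streaming learner would typically rebuild or adjust the part of the model affected by the change.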

Reading List:

Traditional Learning Algorithms

Online Learning Algorithms

Streaming Algorithms
 
Stat583 - METH STAT INF
  Instructor: Prof. Patrick Rojas