The Way from Data to Information

Data Mining



Blog Post

Algorithms of the Intelligent Web

Machine learning algorithms for everyday tasks

I have recently finished writing "Algorithms of the Intelligent Web", and it should hit the bookshelves in a few weeks. I would like to tell you what the book is about and why I wrote it. To save some typing, I will refer to the book hereafter as "AIW" or "the AIW book". The code for the book is hosted on Google Code.

The AIW book includes topics from the areas of machine learning, data mining, statistics, and discovery in knowledge bases. The literature on these topics is vast, but it is almost exclusively academic and heavy in mathematical jargon. Nevertheless, the main ideas of the algorithms can be grasped and used by nearly every software engineer with a minimum of mathematical formalism and a little bit of effort. In fact, one of the goals that I set for the book was to describe every algorithm without writing a single mathematical equation; a couple of equations managed to creep in, but c'est la vie, c'est la guerre ...

The code for the book is written 100% in Java. Most of it was written from scratch (by D. Babenko) -- notable exceptions are the use of Lucene and Drools, a.k.a. JBoss Rules. Perhaps more important, the code was written to be legible and easy to follow. For example, there are probably thousands of implementations of the Naive Bayes algorithm, but how many of them can you read and understand? My bias aside, the Yooreeka Naive Bayes implementation is as succinct and as clean as they come; have a look at it and judge for yourself.
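To give a flavor of what a readable Naive Bayes classifier can look like, here is a minimal sketch of my own. It is not the Yooreeka implementation: the class name `NaiveBayesSketch`, the methods `train` and `classify`, and the tiny training set are all hypothetical, and the sketch assumes a multinomial word-count model with add-one (Laplace) smoothing and a naive tokenizer that splits on non-word characters.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; the names here are hypothetical, not the book's code. */
final class NaiveBayesSketch {
    // label -> number of training documents carrying that label
    private final Map<String, Integer> docCounts = new HashMap<>();
    // label -> (word -> occurrences of that word under the label)
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    // label -> total number of words seen under the label
    private final Map<String, Integer> totalWords = new HashMap<>();
    // every distinct word seen during training
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    /** Records one labeled document. */
    void train(String label, String text) {
        docCounts.merge(label, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts = wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String word : tokenize(text)) {
            if (word.isEmpty()) continue;
            counts.merge(word, 1, Integer::sum);
            totalWords.merge(label, 1, Integer::sum);
            vocabulary.add(word);
        }
    }

    /** Returns the label with the highest log-probability for the given text. */
    String classify(String text) {
        if (totalDocs == 0) throw new IllegalStateException("call train() first");
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docCounts.keySet()) {
            // Start from the log of the prior: the fraction of documents with this label.
            double score = Math.log(docCounts.get(label) / (double) totalDocs);
            Map<String, Integer> counts = wordCounts.get(label);
            int total = totalWords.getOrDefault(label, 0);
            for (String word : tokenize(text)) {
                if (word.isEmpty()) continue;
                int count = counts.getOrDefault(word, 0);
                // Add-one smoothing keeps unseen words from zeroing out the product.
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    // Assumption: lower-casing and splitting on non-word characters is good enough here.
    private static String[] tokenize(String text) {
        return text.toLowerCase().split("\\W+");
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        nb.train("spam", "cheap pills buy now");
        nb.train("spam", "buy cheap watches now");
        nb.train("ham", "meeting agenda for monday");
        nb.train("ham", "lunch on monday with the team");
        System.out.println(nb.classify("buy cheap pills"));
        System.out.println(nb.classify("monday meeting"));
    }
}
```

Working with sums of logarithms rather than products of probabilities is a deliberate choice: it avoids numerical underflow on longer documents without changing which label wins.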

Over the past year, there have been two books of a similar nature. The first was "Programming Collective Intelligence" by T. Segaran (O'Reilly) and the other was "Collective Intelligence in Action" by S. Alag (Manning). I am certain that there will be many more! Empowering an application with the ability to "learn" from its interaction with the people who use it, or the systems that integrate with it, is something that nearly everyone should be able to do. The AIW book should take you one step closer in that direction.

The following is a list of examples that demonstrate the usefulness of intelligent algorithms:

  • Assessing mortgage risk
  • Creating recommendations just like those on Netflix and Amazon
  • Google's PageRank
  • Discovering matches on social-networking sites
  • Organizing the discussions on your favorite news group
  • Selecting topics of interest from shared bookmarks
  • Leveraging user clicks to enhance the user experience on a web site
  • Categorizing emails (or any other text document) based on their content
  • Targeted advertising
  • Fraud detection
  • Stock price forecasting

The list is practically limitless. Every aspect of a software application can be enhanced through the techniques that are presented in AIW and the other books. So, have a look at these books and start leveraging the power of intelligent algorithms; they are here to stay!

More Stories By Babis Marmanis

Haralambos (Babis) Marmanis is a pioneer in the adoption of machine learning techniques for industrial solutions, and also a world expert in supply management. He has about twenty years of experience in developing professional software. Currently, he is the CTO at Emptoris, Inc. Babis received his Ph.D. in Applied Mathematics and Scientific Computing from Brown University. Aside from machine learning and data mining, his interests include multi-tier, high-performance enterprise software.