The Way from Data to Information

Data Mining



Top Stories

Quick quiz! What’s the first thing that comes to mind when you hear the following phrases? Artificial grass, artificial sweeteners, artificial flavors, artificial plants, artificial flowers, artificial diamonds and jewelry, artificial (fake) news. These phrases probably evoke thoughts such as “fake,” “not real,” or even “shabby.” Artificial is such a harsh adjective. The word “artificial” is defined as “imitation; simulated; sham” with synonyms such as fake, false, mock, counterfeit, bogus, phony and factitious. The word “artificial” may not be the right term to use to describe “Artificial Intelligence,” because “artificial intelligence” is anything but fake, false, phony, or a sham. Maybe a better term is “Augmented Human Intelligence,” or a phrase that highlights both the importance of augmenting the human’s intelligence and the need to alleviate the fears that AI means ... (more)

In-Stream Processing | @CloudExpo @robinAKAroblimo #BigData #AI #BI #DX

Most of us have moved our web and e-commerce operations to the cloud, but we are still getting sales reports and other information we need to run our business long after the fact. We sell a hamburger on Tuesday, you might say, but don't know if we made money selling it until Friday. That's because we still rely on Batch processing, where we generate orders, reports, and other management-useful pieces of data when it's most convenient for the IT department to process them, rather than in real time. That was fine when horse-drawn wagons made our deliveries, but it is far too slow for today's world, where stock prices and other bits of information circle the world (literally) at the speed of light. It's time to move to In-Stream Processing. You can't - and shouldn't - keep putting it off. [Figure 1, courtesy of the Grid Dynamics Blog] This diagram may look complicate... (more)
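To make the batch versus in-stream contrast in the teaser above concrete, here is a minimal Python sketch. The sales events, the amounts (in cents), and the function names are hypothetical and chosen only for illustration; this is not the architecture from the Grid Dynamics figure.

```python
from collections import defaultdict

# Hypothetical sales events: (day, item, price_cents, cost_cents).
SALES = [
    ("Tue", "hamburger", 500, 320),
    ("Tue", "fries", 200, 90),
    ("Wed", "hamburger", 500, 350),
]

def batch_profit(events):
    """Batch style: wait for the full set of events, then report once."""
    totals = defaultdict(int)
    for _day, item, price, cost in events:
        totals[item] += price - cost
    return dict(totals)

def stream_profit(events):
    """In-stream style: update and emit the running profit per event."""
    running = defaultdict(int)
    for _day, item, price, cost in events:
        running[item] += price - cost
        # The figure is available as soon as the sale happens.
        yield item, running[item]

if __name__ == "__main__":
    print("batch report:", batch_profit(SALES))
    for item, profit in stream_profit(SALES):
        print("stream update:", item, profit)
```

Both functions arrive at the same totals; the difference is when the numbers become available. The batch version answers only after all events are collected, while the streaming version yields an updated figure with every sale.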

What Business Leaders Need to Know About Machine Learning | @ThingsExpo #AI #ML #IoT #M2M

What Tomorrow's Business Leaders Need to Know About Machine Learning

Sometimes I write a blog just to formulate and organize a point of view, and I think it’s time that I pull together the bounty of excellent information about Machine Learning. This is a topic with which business leaders must become comfortable, especially tomorrow’s business leaders (tip for my next semester University of San Francisco business students!). Machine learning is a key capability that will help organizations drive optimization and monetization opportunities, and there have been some recent developments that will place basic machine learning capabilities into the hands of the lines of business. By the way, there is an absolute wealth of freely-available material on machine learning, so I’ve included a sources section at the end of this blog for folks who want more details on machine lea... (more)

Reconciling Big Data & Data Residency in the Cloud

I recently blogged about “big data” and the value that big data analytics can bring to companies using this type of business intelligence to garner insights and develop competitive advantages. Numerous industry experts have highlighted ways that the cloud is actually enabling a transformation toward data mining, pattern recognition, and predictive analytics to enhance executive decision-making. For example, Booz|Allen|Hamilton’s December 2011 report, Massive Data Analytics and the Cloud, asserts that data cloud-based intelligence analysis will have an unprecedented, long-lasting, and far-reaching impact on business strategy development. ... (more)

Putting Things to Work in the "Internet of Things"

Connected cars, factory equipment and household products communicating over the Internet is increasingly becoming a reality – one that might soon elicit headlines like “Is the Internet of Things a big bust?” That’s because it’s one thing to connect a device to the Internet and direct data back to the manufacturer or service provider. It’s another to derive new information from those data streams. The ability to analyze data in the IoT is critical to designing better products, predicting maintenance issues, and even improving quality of life.

Understanding the Internet of Things

The Internet is no longer just a network of people using computers and smart devices to communicate with each other. In the not too distant future, everything from the factory floor to a city street will be connected to the Internet. Three out of four global business leaders are exploring the... (more)

Internet of Things Maturity Model By @TonyShan | @ThingsExpo [#IoT]

The Internet of Things (IoT) is booming. The “Software for the Internet of Things (IoT) Developer Survey” report, published by Embarcadero Technologies last month, shows that 77% of development teams will have IoT solutions in active development in 2015, with almost half (49%) of IoT developers anticipating their solutions will generate business impacts by the end of this year. The IoT Maturity Model (IoTMM) is a qualitative method to gauge the growth and increasing impact of IoT capabilities in an IT environment from both business and technology perspectives. It comprises a set of criteria, parameters and factors that can be used to describe and measure the effectiveness of IoT adoption and implementation. Five levels of maturity are defined: Advanced, Dynamic, Optimized, Primitive, and Tentative (ADOPT). The definitions of these 5 levels are specified below: Level Desc... (more)

Big Data Business Model Maturity Index and IoT | @ThingsExpo #BigData #IoT #M2M #API #Wearables

Big Data Business Model Maturity Index and the Internet of Things (IoT)

Antonio Figueiredo (@afigueiredo) recently challenged me on Twitter with an interesting question: How would the Big Data Business Model Maturity Index (BDBMMI) change to support the Internet of Things (IoT)? My hope is that the BDBMMI would not need to change to support IoT. It is my hope that the BDBMMI could be used to guide any industry that is going through a data and analytics-driven transformation, such as what is happening to many industries due to IoT. Let’s see how one could use the BDBMMI to help organizations exploit the IoT. But before we start that exercise, let’s start with some key definitions: The Big Data Business Model Maturity Index (BDBMMI) is a framework to measure how effective an organization is at leveraging data and analytics to power the business (see Figure 1). We ... (more)

Data Unification at Scale | @CloudExpo #BigData #DataLake #AI #Analytics

The term Data Unification is new in the Big Data lexicon, pushed by a variety of companies such as Talend, 1010Data, and TamR. Data unification deals with the domain known as ETL (Extraction, Transformation, Loading), which emerged during the 1990s when Data Warehousing was gaining relevance. ETL refers to the process of extracting data from inside or outside sources (multiple applications typically developed and supported by different vendors or hosted on separate hardware), transforming it to fit operational needs (based on business rules), and loading it into end target databases, more specifically, an operational data store, data mart, or data warehouse. These are read-only databases for analytics. Initially the analytics was mostly retrospective (e.g., how many shoppers aged 25-35 bought this item between May and July?). This was like driving a car looking at the ... (more)
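As a rough illustration of the extract, transform, load sequence described in the teaser above, here is a minimal Python sketch using the standard-library sqlite3 module. The two source record sets, the purchases table, and every function name are hypothetical and invented for this example; the tools named above work at far larger scale and are not represented here.

```python
import sqlite3

# Hypothetical raw records from two separate applications with different layouts.
WEB_ORDERS = [
    {"customer": "Ann", "age": "31", "item": "widget", "month": "06"},
    {"customer": "Bob", "age": "47", "item": "widget", "month": "06"},
]
STORE_ORDERS = [
    {"name": "Cy", "age": 28, "sku": "widget", "month": 5},
    {"name": "Di", "age": 33, "sku": "gadget", "month": 9},
]

def extract():
    """Extract: pull raw rows from each source, tagged with their origin."""
    for row in WEB_ORDERS:
        yield "web", row
    for row in STORE_ORDERS:
        yield "store", row

def transform(source, row):
    """Transform: map both layouts onto one schema with consistent types."""
    if source == "web":
        return (row["customer"], int(row["age"]), row["item"], int(row["month"]))
    return (row["name"], int(row["age"]), row["sku"], int(row["month"]))

def load(conn, rows):
    """Load: write the unified rows into the analytics table."""
    conn.execute(
        "CREATE TABLE purchases (customer TEXT, age INTEGER, item TEXT, month INTEGER)"
    )
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)", rows)
    conn.commit()

def run_etl():
    """Run the three stages against an in-memory database and return it."""
    conn = sqlite3.connect(":memory:")
    load(conn, [transform(source, row) for source, row in extract()])
    return conn

def shoppers_aged_25_to_35(conn, item, first_month, last_month):
    """Retrospective query: shoppers aged 25-35 who bought an item in a month range."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM purchases "
        "WHERE item = ? AND age BETWEEN 25 AND 35 AND month BETWEEN ? AND ?",
        (item, first_month, last_month),
    )
    return cur.fetchone()[0]

if __name__ == "__main__":
    warehouse = run_etl()
    print(shoppers_aged_25_to_35(warehouse, "widget", 5, 7))
```

The final query mirrors the retrospective question quoted in the teaser (shoppers aged 25-35 who bought an item between May and July), run against the loaded table rather than the original sources.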

Using Java Data Mining to Develop Advanced Analytics Applications

With the standardization of the Java Data Mining (JDM) API, Enterprise Java applications have been given predictive technologies. Data mining is a widely accepted technology used for extracting hidden patterns from data. It is used to solve many business problems like identifying cross-sell or up-sell opportunities for specific customers based on customer profiles and purchase patterns, predicting which customers are likely to churn, creating effective product campaigns, detecting fraud, and finding natural segments. More and more data mining algorithms are being embedded in databases. Advanced analytics, like data mining, is now widely integrated with applications. The objective of this article is to introduce Java developers to data mining and explain how the JDM standard can be used to integrate this technology with enterprise applications. Data Mining Functions... (more)

Kognitio Celebrates 20 Years of WX2 Data Warehousing Platform

CHICAGO, May 6, 2009 — Kognitio today announced the 20th anniversary of its WX2 analytical database.  Since its introduction, WX2 has been a constant source of innovation in the field of data warehousing. It was the first platform to offer large-scale data warehousing, the first to enable Data Warehousing as a Service (DaaS) for organizations and one of the first to move onto an entirely software-based model. Kognitio has been responsible for innovating and advancing numerous data warehousing concepts and practices that are considered the leading edge in the field of business intelligence, such as the ability to query hundreds of terabytes of information within seconds instead of weeks, and enabling companies of all sizes to more easily take advantage of advanced data analytics at lower cost. The announcement was made at The Data Warehousing Institute’s (TDWI) World... (more)

Breach Is The Word, Is The Word, Is The Word That You Heard

…to the tune of $6.6 Mil per-r-r Breach. Yup – according to Ponemon Institute the average cost of a data breach is $6.6 million and they also report that it costs about $215 per compromised record (pdf). McAfee estimates $1 trillion in losses yearly, due to data theft – that’s 10 to the 12th dollars. Imagine if IT budgets could get that back? The past two years saw a significant increase in large scale attacks with the January 2007 TJX breach starting the massive flurry. As of October 2007, TJX said that more than 94 million accounts were affected at a cost of over $256 million. At the time it was the largest data loss incident to date. The crooks kept it up, however. Hannaford Grocers was hit in December 2007 but they didn’t discover it until February 2008 and announced in March 2008 that 4.2 million cards had been exposed, leading to over 1800 cases of fraud. In ... (more)