Citizen Data Scientist, Jumbo Shrimp, and Other Descriptions That Make No
Okay, let me get this out there: I find the term “Citizen Data Scientist”
confusing. Gartner defines a “citizen data scientist” as “a person who
creates or generates models that leverage predictive or prescriptive
analytics but whose primary job function is outside of the field of
statistics and analytics.”
While we teach business users to “think like a data scientist” in their
ability to identify those variables and metrics that might be better
predictors of performance, I do not expect that the business stakeholders are
going to be able to create and generate analytic models. I do not believe,
nor do I expect, that the business stakeholders are going to be proficient
enough with tools like SAS or R or Python or Mahout or MADlib to 1) create or
generate the models, and then 2) be profi... (more)
The Dean of the University of San Francisco School of Management, Elizabeth
Davis, recently asked me to sit on a Big Data panel at the Direct Sales
Association conference. I was given a 5-minute slot to “demystify” Big
Data to a non-technical group of about 1,000 people, to help them understand
where and how this thing called “Big Data” could help them.
Well, if you know me, I can barely introduce myself in 5 minutes. But this was
particularly challenging for me, as I’m used to talking about Big Data with
organizations with at least some level of Big Data experience or
understanding (maybe they should get my second book – the “Big Data
MBA” – and start there!).
So I accepted the challenge, and here is what I said (and yes, I did it
within the 5-minute window).
Myth #1: Every Business needs a Big Data strategy.
Reality #1: You don’t need a Big Data strategy; you need ... (more)
Big Data Business Model Maturity Index and the Internet of Things (IoT)
Antonio Figueiredo (@afigueiredo) recently challenged me on Twitter with an
interesting question: How would the Big Data Business Model Maturity Index
(BDBMMI) change to support the Internet of Things (IoT)? My hope is that the
BDBMMI would not need to change to support IoT. It is my hope that the BDBMMI
could be used to guide any industry that is going through a data and
analytics-driven transformation, such as what is happening to many industries
due to IoT.
Let’s see how one could use the BDBMMI to help organizations to exploit the
IoT. But before we start that exercise, let’s start with some key
The Big Data Business Model Maturity Index (BDBMMI) is a framework to measure
how effective an organization is at leveraging data and analytics to power
the business (see Figure 1). We ... (more)
I know that I’ve hit the big (data) time when concepts that I developed
start to appear as infographics. Today I am very proud to announce the
launching of the demystified Big Data Business Model Maturity Index (BDBMMI)
Through the use of clear, simple language paired with practical examples
that illustrate each stage of the big data maturity journey, the goal of this
infographic is to demystify the BDBMMI – to make it easier for customers
(and readers) to understand what the BDBMMI is, and how to use it to
successfully leverage data and analytics to power their business models.
Figure 1: Big Data Business Model Maturity Index Infographic
Today you’ll learn more about Dave’s story and how he navigated through
the stages of the Big Data Business Model Maturity Index and put big data to
work for his organization.
Before we get star... (more)
New Approaches for New Big Data Insights
by Melvin Greer
Business Intelligence has matured as a core competency necessary to sustain
competitive advantage. Organizations of every size and industry are
generating valuable data with each interaction, and that data can be
captured, analyzed, and turned into business insight. These organizations are
using analytics features like dashboards, advanced visualization, data
warehousing, and other technologies to achieve their strategic business
Many companies are taking a hybrid cloud approach to data analysis.
Leveraging a hybrid cloud environment as part of a big data analytics
strategy enables businesses to take advantage of cloud elasticity. This
allows organizations to process data across clusters of computers, enabling
analysis to occur across multiple cloud compute environments. As
organizations' need for... (more)
Ocarina's Carter George continued the conversation on backups, asking if the
conventional backup paradigm was obsolete, and if file copies could serve the
same purpose. As mentioned in our "What Is a Backup?" post, this is the same
question posed by EMC's Scott Waterhouse recently.
Putting Copies To The Test
George suggests a copy-based scenario: "Why not just move files that are
candidates for being backed up to a separate tier of storage, keeping them as
files in their native format, and organizing them in time coherent views?"
To determine whether this is truly a backup, let's apply our new rules for
when a copy becomes a backup:
A copy is, by definition, a copy of a set of data. These copies are not described
as being protected or offline, which worries the IT admin in me. Could they
be overwritten or corrupted? Would they disappear along with the primary data... (more)
Some of both, apparently. A recent Ponemon Institute PCI-DSS Compliance
survey revealed that 71% of companies admitted that data security is
not a top priority, and 55% say they are protecting only credit card data and
not other sensitive information such as bank account info, Social Security
numbers and driver's license data.
Additional statistics show that a minuscule 28% of smaller companies
(501-1000 employees) are PCI-DSS compliant and around 70% of large companies
(>75,000 employees) say they meet the regulations. The one that jumps out
for me is the small merchant stat. I understand that cost is a large factor
for smaller companies in becoming PCI compliant, but just imagine how many companies
and industries fall into the 501-1000 employee category.
And that doesn’t count all the even smaller ‘Family Owned’ restaurants,
auto repair shops or any other servi... (more)
R is an incredibly comprehensive statistics package. Even if you just look at
the standard R distribution (the base and recommended packages), R can do
pretty much everything you need for data manipulation, visualization, and
statistical analysis. And for everything else, there are more than 5000
packages on CRAN and other repositories, and the big-data capabilities of
Revolution R Enterprise.
As a result, trying to make a list of everything R can do is a difficult
task. But we've made an effort in this list of R Language Features, a new
section on the Revolution Analytics website. It's broken up into four main
sections (analytics, graphics and visualization, R applications and
extensions, and programming language features), each with their own
ANALYTICS
Basic Mathematics
Basic Statistics
Probability Distributions
Big Data Analytics *
Machine Learning
Opt... (more)
Bob Gourley, editor of CTOvision as well as founder and CTO of Crucial
Point, LLC, was recently interviewed by WashingtonExec, where he shared his
views on emerging information technology, government needs, and Big Data.
The original article can be found here, and the interview is reproduced
How does Bob Gourley, founder and CTO of Crucial Point, LLC and editor of the
popular tech blog CTOVision.com, define big data?
WashingtonExec caught up with Gourley to talk about what he learned from his
time in government that has guided him in the private sector and what he
predicts will be “the next big thing” for the IT industry. He also gave
an update on the 2012 Big Data Solutions Awards.
Securing IP addresses, banking apps and predictive analytics were also
discussed in this interview.
WashingtonExec: Could you start out by telling us a little about your
There is no doubt that Big Data holds infinite promise for a range of
industries. Better visibility into data across various sources enables
everything from insight into saving electricity to agricultural yield to
placement of ads on Google. But when it comes to deriving value from data, no
industry has been doing it as long or with as much rigor as clinical
Unlike other markets that are delving into Big Data for the first time and
don't know where to begin, drug and device developers have spent years
refining complex processes for asking very specific questions with clear
purposes and goals. Whether using data for designing an effective and safe
treatment for cholesterol, or collecting and mining data to understand proper
dosage of cancer drugs, life sciences has had to dot every "i" and cross
every "t" in order to keep people safe and for new therapi... (more)
A capability model is a structure that represents the core abilities and
competencies of an entity (department, organization, person, system, or
technology) to achieve its objectives, especially in relation to its overall
mission and functions.
The Big Data Capability Model (BDCM) is defined as the key functionalities in
dealing with Big Data problems and challenges.
It describes the major features, behaviors, practices and processes in an
organization, which can reliably and sustainably produce required outcomes
for Big Data demands. The BDCM consists of the following elements:
Collection: collect raw data, sources, formats, discovery, protocols, staging
ELT: extract, load and transform data
Store: NoSQL repository, key-value, column-based, document-oriented, graph, Hadoop, MPP, in-memory, cache
Integration: data move, messaging, consumption, access, connector
Processing... (more)