Building your data and analytics strategy

When it comes to being data-driven, organizations run the gamut in maturity. Most believe that data and analytics provide insights. But only one-third of respondents to a TDWI survey said they were truly data-driven, meaning they analyze data to drive decisions and actions.

Successful data-driven businesses foster a collaborative, goal-oriented culture. Leaders believe in data and are governance-oriented. The technology side of the business ensures sound data quality and puts analytics into operation. The data management strategy spans the full analytics life cycle. Data is accessible and usable by multiple people – data engineers and data scientists, business analysts and less-technical business users.

TDWI analyst Fern Halper surveyed analytics and data professionals across industries and identified the following five best practices for becoming a data-driven organization.

1. Build relationships to support collaboration

If IT and business teams don’t collaborate, the organization can’t operate in a data-driven way – so eliminating barriers between groups is crucial. Achieving this can improve market performance and innovation, but collaboration is challenging. Business decision makers often don’t think IT understands the importance of fast results; conversely, IT doesn’t think the business understands data management priorities. Office politics come into play.

But having clearly defined roles and responsibilities with shared goals across departments encourages teamwork. These roles should include IT/architecture, business, and others who manage various tasks on the business and IT sides (from business sponsors to DevOps).

2. Make data accessible and trustworthy

Making data accessible – and ensuring its quality – are key to breaking down barriers and becoming data-driven. Whether it’s a data engineer assembling and transforming data for analysis or a data scientist building a model, everyone benefits from trustworthy data that’s unified and built around a common vocabulary.

As organizations analyze new forms of data – text, sensor, image and streaming – they’ll need to do so across multiple platforms like data warehouses, Hadoop, streaming platforms and data lakes. Such systems may reside on-site or in the cloud. TDWI recommends several best practices to help:

  • Establish a data integration and pipeline environment with tools that provide federated access and join data across sources. It helps to have point-and-click interfaces for building workflows, and tools that support ETL, ELT and advanced specifications like conditional logic or parallel jobs.
  • Manage, reuse and govern metadata – that is, the data about your data. This includes size, author, database column structure, security and more.
  • Provide reusable data quality tools with built-in analytics capabilities that can profile data for accuracy, completeness and ambiguity.
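The data quality practice above – profiling data for accuracy, completeness and ambiguity – can be sketched in a few lines. This is a minimal illustration, not any vendor’s tool; the record fields and key choice are hypothetical.

```python
# Minimal data-profiling sketch (hypothetical records and field names).
# A real data quality tool profiles many dimensions; here we check two:
# per-field completeness and duplicate records on a chosen key.

def profile(records, key_fields):
    """Report row count, per-field completeness and duplicate keys."""
    total = len(records)
    fields = {f for r in records for f in r}
    completeness = {
        f: sum(1 for r in records if r.get(f) not in (None, "")) / total
        for f in sorted(fields)
    }
    seen, dupes = set(), 0
    for r in records:
        key = tuple(r.get(f) for f in key_fields)
        dupes += key in seen  # True counts as 1
        seen.add(key)
    return {"rows": total, "completeness": completeness, "duplicates": dupes}

customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},               # incomplete email
    {"id": 1, "email": "a@example.com"},  # duplicate key
]
report = profile(customers, key_fields=["id"])
```

A report like this – email only two-thirds complete, one duplicate id – is the kind of signal that tells a data engineer the set isn’t yet trustworthy for analysis.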

3. Provide tools to help the business work with data

From marketing and finance to operations and HR, business teams need self-service tools to speed and simplify data preparation and analytics tasks. Such tools may include built-in, advanced techniques like machine learning, and many work across the analytics life cycle – from data collection and profiling to monitoring analytical models in production.

These “smart” tools feature three capabilities:

  • Automation helps during model building and model management processes. Data preparation tools often use machine learning and natural language processing to understand semantics and accelerate data matching.
  • Reusability pulls from what has already been created for data management and analytics. For example, a source-to-target data pipeline workflow can be saved and embedded into an analytics workflow to create a predictive model.
  • Explainability helps business users understand the output when, for example, they’ve built a predictive model using an automated tool. Tools that explain what they’ve done are ideal for a data-driven company.
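The reusability idea above – saving a source-to-target data preparation step and embedding it in an analytics workflow – can be sketched as follows. All function and field names are illustrative, not from any particular product.

```python
# Hypothetical sketch of reusability: a saved data-preparation step is
# embedded in an analytics workflow instead of being rebuilt for the model.

def prep_pipeline(rows):
    """Saved source-to-target step: standardize names, cast spend, drop empties."""
    return [
        {"name": r["name"].strip().title(), "spend": float(r["spend"])}
        for r in rows
        if r.get("name") and r.get("spend")
    ]

def analytics_workflow(rows):
    """Reuse the saved prep step, then compute a simple model input."""
    clean = prep_pipeline(rows)
    avg_spend = sum(r["spend"] for r in clean) / len(clean)
    return clean, avg_spend

rows = [
    {"name": "  alice ", "spend": "120.0"},
    {"name": "bob", "spend": "80.0"},
    {"name": "", "spend": "50.0"},  # dropped by the prep step
]
clean, avg_spend = analytics_workflow(rows)
```

The point of the pattern is that `prep_pipeline` is defined once, governed once, and reused wherever the analytics life cycle needs it.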

4. Consider a cohesive platform that supports collaboration and analytics

As organizations mature analytically, it’s important for their platform to support multiple roles in a common interface with a unified data infrastructure. This strengthens collaboration and makes it easier for people to do their jobs.

For example, a business analyst can use a discussion space to collaborate with a data scientist while building a predictive model, and during testing. The data scientist can use a notebook environment to test and validate the model as it’s versioned and metadata is captured. The data scientist can then notify the DevOps team when the model is ready for production – and they can use the platform’s tools to continually monitor the model.

5. Use modern governance technologies and practices

Governance – that is, rules and policies that prescribe how organizations protect and manage their data and analytics – is critical in learning to trust data and become data-driven. But TDWI research indicates that one-third of organizations don’t govern their data at all. Instead, many focus on security and privacy rules. Their research also indicates that fewer than 20 percent of organizations do any type of analytics governance, which includes vetting and monitoring models in production.

Decisions based on poor data – or models that have degraded – can have a negative effect on the business. As more people across an organization access data and build models, and as new types of data and technologies emerge (big data, cloud, stream mining), data governance practices need to evolve. TDWI recommends three features of governance software that can strengthen your data and analytics governance:

  • Data catalogs, glossaries and dictionaries. These tools often include sophisticated tagging and automated procedures for building and keeping catalogs up to date – as well as discovering metadata from existing data sets.
  • Data lineage. Data lineage combined with metadata helps organizations understand where data originated and track how it was changed and transformed.
  • Model management. Ongoing model tracking is crucial for analytics governance. Many tools automate model monitoring, schedule updates to keep models current and send alerts when a model is degrading.
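The model-monitoring idea above – alerting when a model in production is degrading – reduces to comparing recent performance against the baseline recorded at deployment. The sketch below is a hypothetical simplification; real tools track many metrics and drift statistics.

```python
# Hypothetical model-monitoring sketch: compare recent accuracy against
# the baseline captured when the model was validated for production.

def check_degradation(baseline_accuracy, recent_accuracies, tolerance=0.05):
    """Return an alert message if average recent accuracy has drifted
    more than `tolerance` below the production baseline, else None."""
    current = sum(recent_accuracies) / len(recent_accuracies)
    if baseline_accuracy - current > tolerance:
        return f"ALERT: accuracy fell from {baseline_accuracy:.2f} to {current:.2f}"
    return None

# Model validated at 0.90 accuracy; the last three scoring windows show drift.
alert = check_degradation(0.90, [0.86, 0.83, 0.81])
```

An alert like this is what would trigger the scheduled retraining or escalation an analytics governance process calls for.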

In the future, organizations may move beyond traditional governance council models to new approaches like agile governance, embedded governance or crowdsourced governance.

But involving both IT and business stakeholders in the decision-making process – including data owners, data stewards and others – will always be key to robust governance at data-driven organizations.


There’s no single blueprint for beginning a data analytics project – never mind ensuring a successful one.

However, the following questions help individuals and organizations frame their data analytics projects in instructive ways. Put differently, think of these questions as more of a guide than a comprehensive how-to list.

1. Is this your organization’s first attempt at a data analytics project?

When it comes to data analytics projects, culture matters. Consider Netflix, Google and Amazon. Organizations like these have successfully completed data analytics projects. Even better, they have built analytics into their cultures and become data-driven businesses.

As a result, they will do better than neophytes. Fortunately, first-timers are not destined for failure. They should just temper their expectations.

2. What business problem do you think you’re trying to solve?

This might seem obvious, but plenty of folks fail to ask it before jumping in. Note how I qualified the question with “do you think.” Sometimes the root cause of a problem isn’t what we at first believe it to be.

In any case, you don’t need to solve the entire problem all at once by trying to boil the ocean. In fact, you shouldn’t take this approach. Project methodologies (like agile) allow organizations to take an iterative approach and embrace the power of small batches.

3. What types and sources of data are available to you?

Most if not all organizations store vast amounts of enterprise data. Looking at internal databases and data sources makes sense. Don’t make the mistake of believing, though, that the discussion ends there.

External data sources in the form of open data sets (such as data.gov) continue to proliferate. There are easy methods for retrieving data from the web and getting it back in a usable format – scraping, for example. This tactic can work well in academic environments, but scraping could be a sign of data immaturity for businesses. It’s always best to get your hands on the original data source when possible.

Caveat: Just because the organization stores it doesn’t mean you’ll be able to easily access it. Pernicious internal politics stifle many an analytics endeavor.

4. What types and sources of data are you allowed to use?

With all the hubbub over privacy and security these days, foolish is the soul who fails to ask this question. As some retail executives have learned in recent years, a company can abide by the law completely and still make people feel decidedly icky about the privacy of their purchases. Or, consider a health care organization – it may not technically violate the Health Insurance Portability and Accountability Act of 1996 (HIPAA), yet it could still raise privacy concerns.

Another example is the GDPR. Adhering to this regulation means that organizations won’t necessarily be able to use personal data they previously could use – at least not in the same way.

5. What is the quality of your organization’s data?

Common mistakes here include assuming your data is complete, accurate and unique (read: nonduplicate). During my consulting career, I could count on one hand the number of times a client handed me a “perfect” data set. While it’s important to cleanse your data, you don’t need pristine data just to get started. As Voltaire said, “Perfect is the enemy of good.”

6. What tools are available to extract, clean, analyze and present the data?

This isn’t the 1990s, so please don’t tell me that your analytic efforts are limited to spreadsheets. Sure, Microsoft Excel works with structured data – if the data set isn’t all that big. Make no mistake, though: Everyone’s favorite spreadsheet program suffers from plenty of limitations, in areas like:

  • Handling semistructured and unstructured data.
  • Tracking changes/version control.
  • Dealing with size restrictions.
  • Ensuring governance.
  • Providing security.

For now, suffice it to say that if you’re trying to analyze large, complex data sets, there are many tools well worth exploring. The same holds true for visualization. Never before have we seen such an array of powerful, affordable and user-friendly tools designed to present data in interesting ways.

Caveat 1: While software vendors often ape each other’s features, don’t assume that each application can do everything that the others can.

Caveat 2: With open source software, remember that “free” software could be compared to a “free” puppy. To be direct: Even with open source software, expect to spend some time and effort on training and education.

7. Do your employees possess the right skills to work on the data analytics project?

The database administrator may well be a whiz at SQL. That doesn’t mean, though, that she can easily analyze gigabytes of unstructured data. Many of my students need to learn new programs over the course of the semester, and the same holds true for employees. In fact, organizations often find that they need to:

  • Provide training for existing employees.
  • Hire new employees.
  • Contract consultants.
  • Post the project on sites such as Kaggle.
  • All of the above.

Don’t assume that your employees can pick up new applications and frameworks 15 minutes at a time every other week. They can’t.

8. What will be done with the results of your analysis?

Case in point: a company I once analyzed routinely spent millions of dollars recruiting MBAs at Ivy League schools, only to see them leave within two years. Rutgers MBAs, for their part, stayed much longer and performed much better.

Despite my findings, the company pressed on. Out of vanity, it refused to stop recruiting at Harvard, Cornell and other elite schools. In his own words, the head of recruiting just “liked” going to these schools, data be damned.

Food for thought: What will an individual, group, department or organization do with keen new insights from your data analytics projects? Will the result be real action? Or will a report just sit in someone’s inbox?

9. What types of resistance can you expect?

You might think that people always and willingly embrace the results of data-oriented analysis. And you’d be spectacularly wrong.

Case in point: Major League Baseball (MLB) umpires get close ball and strike calls wrong more often than you’d think. Why wouldn’t they want to improve their performance when presented with objective data? It turns out that many don’t. In some cases, human nature makes people want to reject data and analytics that conflict with their worldviews. Years ago, before the subscription model became wildly popular, some Blockbuster executives didn’t want to believe that more convenient ways to watch movies existed.

Caveat: Ignore the power of internal resistance at your own peril.

10. What are the costs of inaction?

Sure, this is a high-level query and the answers depend on myriad factors.

For instance, a pharma company with years of patent protection will respond differently than a startup with a novel idea and competitors nipping at its heels. Interesting subquestions here include:

  • Do the data analytics projects merely confirm what we already know?
  • Do the numbers show anything conclusive?
  • Could we be capturing false positives and false negatives?

Think about these questions before undertaking data analytics projects. Don’t take the queries above as gospel. By and large, though, experience proves that asking these questions frames the problem well and sets the organization up for success – or at least minimizes the chance of a disaster.


Most organizations understand the importance of data governance in concept. But they may not realize all the multifaceted, positive impacts of applying good governance practices to data across the organization. For example, ensuring that your sales and marketing analytics relies on measurably trustworthy customer data can lead to increased revenue and shorter sales cycles. And having a solid governance program to ensure your enterprise data meets regulatory requirements could help you avoid penalties.

Companies that start data governance programs are motivated by a variety of factors, internal and external. Regardless of the reasons, two common themes underlie most data governance activities: the desire for high-quality customer information, and the need to adhere to requirements for protecting and securing that data.

What’s the best way to ensure you have accurate customer data that meets stringent requirements for privacy and security?

For obvious reasons, companies exert significant effort using tools and third-party data sets to enforce the consistency and accuracy of customer data. But there will always be situations in which the managed data set cannot be adequately synchronized and made consistent with “real-world” data. Even strictly defined and enforced internal data policies can’t prevent inaccuracies from creeping into the environment.


Why you should move beyond a conventional approach to data governance

When it comes to customer data, the most accurate sources for validation are the customers themselves! In essence, every customer owns his or her information, and is the most reliable authority for ensuring its quality, consistency and currency. So why not develop policies and methods that empower the actual owners to be accountable for their data?

Doing this means extending the concept of data governance to the customers and defining data policies that engage them to take an active role in overseeing their own data quality. The starting point for this process fits within the data governance framework – define the policies for customer data validation.

A good template for formulating those policies can be adapted from existing regulations regarding data protection. This approach will assure customers that your organization is serious about protecting their data’s security and integrity, and it will encourage them to actively participate in that effort.

Examples of customer data engagement policies

  • Data protection defines the levels of protection the organization will use to protect the customer’s data, as well as what responsibilities the organization will assume in the event of a breach. The protection will be enforced in relation to the customer’s selected preferences (which presumes that customers have reviewed and approved their profiles).
  • Data access control and security define the protocols used to control access to customer data and the criteria for authenticating users and authorizing them for particular uses.
  • Data use describes the ways the organization will use customer data.
  • Customer opt-in describes the customers’ options for setting up the ways the organization can use their data.
  • Customer data review asserts that customers have the right to review their data profiles and to verify the integrity, consistency and currency of their data. The policy also specifies the time frame in which customers are expected to do this.
  • Customer data update describes how customers can alert the organization to changes in their data profiles. It allows customers to ensure their data’s validity, integrity, consistency and currency.
  • Right-to-use defines the organization’s right to use the data as described in the data use policy (and based on the customer’s selected profile options). This policy may also set a time frame associated with the right-to-use based on the elapsed time since the customer’s last date of profile verification.
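The right-to-use policy above – use is allowed only within a set time frame since the customer last verified his or her profile – can be sketched as a simple check. The field names, window length and profile structure are hypothetical.

```python
# Hypothetical sketch of a right-to-use check: data may be used only if
# the customer opted in and verified the profile within the review window.

from datetime import date, timedelta

REVIEW_WINDOW = timedelta(days=365)  # policy-defined verification time frame

def right_to_use(profile, today):
    """Return True only if opted in and verified recently enough."""
    verified_recently = today - profile["last_verified"] <= REVIEW_WINDOW
    return profile["opted_in"] and verified_recently

profile = {"opted_in": True, "last_verified": date(2018, 1, 15)}
ok_recent = right_to_use(profile, today=date(2018, 6, 1))  # within the window
ok_lapsed = right_to_use(profile, today=date(2019, 6, 1))  # verification lapsed
```

Encoding the policy this way makes the customer’s own verification date the gate on the organization’s use of the data, which is exactly the shared-accountability model these policies aim for.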

The goal of such policies is to establish an agreement between the customer and the organization that basically says the organization will protect the customer’s data and only use it in ways the customer has authorized – in return for the customer ensuring the data’s accuracy and specifying preferences for its use. This model empowers customers to take ownership of their data profile and assume responsibility for its quality.

Clearly articulating each party’s responsibilities for data stewardship benefits both the organization and the customer by ensuring that customer data is high-quality and properly maintained. Better yet, recognize that the value goes beyond improved revenues or better compliance.

Empowering customers to take control and ownership of their data just might be enough to motivate self-validation.

Click here to access SAS’ detailed analysis

The Innovation Game – How Data is Driving Digital Transformation

Technology waits for no one. And those who strike first will have an advantage. The steady decline in business profitability across multiple industries threatens to erode future investment, innovation and shareholder value. Fortunately, the emergence of artificial intelligence (AI) can help kick-start profitability. Accenture research shows that AI has the potential to raise rates of profitability by an average of 38 percent and deliver an economic boost of US$14 trillion across 16 industries in 12 economies by 2035.

Driven by these economic forces, the age of digital transformation is in full swing. Today we can’t be “digital to the core” if we don’t leverage all new data sources – unstructured, dark data and third-party sources. Similarly, we have to take advantage of the convergence of AI and analytics to uncover previously hidden insights. But with the increasing use of AI, we also have to be responsible and take into account the social implications.

Finding answers to the biggest questions starts with data and with ensuring you are capitalizing on the vast data sources available within your own business. Thanks to the power of AI/machine learning and advanced algorithms, we have moved from the era of big data to the era of ALL data, and that is helping clients create a more holistic view of their customers and greater operational efficiency.

Embracing the convergence of AI and analytics is crucial to success in our digital transformation. Together, AI and analytics are:

  • unlocking tremendous value from data that was previously hidden or unreachable,
  • changing the way we interact with people and technology,
  • improving the way we make decisions,
  • and giving way to new agility and opportunities.

While businesses are still in the infancy of tapping into the vast potential of these combined technologies, now is the time to accelerate. But to thrive, we need to be pragmatic in finding the right skills and partners to guide our strategy.

Finally, whenever we envision the possibilities of AI, we should consider the responsibility that comes with it. Trust in the digital era or “responsible AI” cannot be overlooked. Explainable AI and AI transparency are critical, particularly in such areas as

  • financial services,
  • healthcare,
  • and life sciences.

The new imperative of our digital transformation is to balance intelligent technology and human ingenuity to innovate every facet of business and become a smarter enterprise.

The exponential growth of data underlying the strategic imperative of enterprise digital transformation has created new business opportunities along with tremendous challenges. Today, we see organizations of all shapes and sizes embarking on digital transformation. As uncovered in Corinium Digital’s research, the primary drivers of digital transformation are rising customer expectations and the push for more efficient internal processes.

Data is at the heart of this transformation and provides the fuel to generate meaningful insights. We have reached the tipping point where all businesses recognize they cannot compete in a digital age using analog-era legacy solutions and architectures. The winners in the next phase of business will be those enterprises that obtain a clear handle on the foundations of modern data management, specifically the nexus of

  • data quality,
  • cloud,
  • and artificial intelligence (AI).

While most enterprises have invested in on-premises data warehouses as the backbone of their analytic data management practices, many are shifting their new workloads to the cloud. The proliferation of new data types and sources is accelerating the development of data lakes, with aspirations of gaining integrated analytics that can accelerate new business opportunities. We found in the research that over 60% of global enterprises are now investing in a hybrid, multi-cloud strategy that combines data from cloud environments such as Microsoft Azure with existing on-premises infrastructures. This hybrid, multi-cloud strategy will need to correlate with their investments in data analytics, and it will become imperative to manage data seamlessly across all platforms. At Paxata, our mission is to give everyone the power to intelligently profile and transform data into consumable information at the speed of thought – empowering everyone, not just technical users, to prepare their data and make it ready for analytics and decision making.

The first step in making this transition is to eliminate the bottlenecks of traditional IT-led data management practices through AI-powered automation.

Second, you need to apply modern data preparation and data quality principles and technology platforms to support both analytical and operational use cases.

Third, you need a technology infrastructure that embraces the hybrid, multi-cloud world. Paxata sits right at the center of this new shift, helping enterprises profile and transform complex data types in high-variety, high-volume environments. As such, we’re excited about partnering with Accenture and Microsoft to accelerate businesses with our ability to deliver modern analytical and operational platforms that address today’s digital transformation requirements.

Artificial intelligence is causing two major revolutions simultaneously among developers and enterprises. These revolutions will drive the technology decisions for the next decade. Developers are massively embracing AI. As a platform company, Microsoft is focused on enabling developers to make the shift to the next app development pattern, driven by the intelligent cloud and intelligent edge.

AI is the runtime that will power the apps of the future. At the same time, enterprises are eager to adopt and integrate AI. Cloud and AI are the most requested topics in Microsoft Executive Briefing Centers. AI is changing how companies serve their customers, run their operations, and innovate.

Ultimately, every business process in every industry will be redefined in profound ways. If it used to be true that “software was eating the world,” it is now true to say that “AI is eating software”. A new competitive differentiator is emerging: how well an enterprise exploits AI to reinvent and accelerate its processes, value chain and business models. Enterprises need a strategic partner who can help them transform their organization with AI. Microsoft is emerging as a solid AI leader as it is in a unique position to address both revolutions. Our strength and differentiation lie in the combination of multiple assets:

  • Azure AI services that bring AI to every developer. Over one million developers are accessing our pre-built and customizable AI services. We have the most comprehensive solution for building bots, combined with a powerful platform for Custom AI development with Azure Machine Learning that spans the entire AI development lifecycle, and a market leading portfolio of pre-built cognitive services that can be readily attached to applications.
  • A unique cloud infrastructure, including CPU, GPU and soon FPGA, makes Azure the most reliable, scalable and fastest cloud to run AI workloads.
  • Unparalleled tools. Visual Studio, used by over 6 million developers, is the most preferred tool in the world for application development. Visual Studio and Visual Studio Code are powerful “front doors” through which to attract developers seeking to add AI to their applications.
  • Ability to add AI to the edge. We enable developers, through our tools and services, to develop an AI model and deploy that model anywhere. Through our support for ONNX – the open source representation for AI models in partnership with Facebook, Amazon, IBM and others – as well as for generic containers, we allow developers to run their models on the IoT edge and leverage the entire IoT solution from Azure.

But the competition to win enterprises is not played only on the platform battlefield; enterprises are demanding solutions. Microsoft AI solutions provide turnkey implementations for customers who want to transform their core processes with AI. Our unique combination of IP and consulting services addresses common scenarios such as business agents, sales intelligence or marketing intelligence. Because our solutions are built on top of our compelling AI platform, unlike with our competitors’ offerings, our customers are not locked in to any one consulting provider; they remain in full control of their data and can extend the scenarios or target new scenarios themselves or through our rich partner ecosystem.


Click here to access Corinium’s White Paper

2018 AI predictions – 8 insights to shape your business strategy

  1. AI will impact employers before it impacts employment
  2. AI will come down to earth—and get to work
  3. AI will help answer the big question about data
  4. Functional specialists, not techies, will decide the AI talent race
  5. Cyberattacks will be more powerful because of AI—but so will cyberdefense
  6. Opening AI’s black box will become a priority
  7. Nations will spar over AI
  8. Pressure for responsible AI won’t be on tech companies alone

Key implications

1) AI will impact employers before it impacts employment

As signs grow this year that the great AI jobs disruption will be a false alarm, people are likely to more readily accept AI in the workplace and society. We may hear less about robots taking our jobs, and more about robots making our jobs (and lives) easier. That in turn may lead to a faster uptake of AI than some organizations are expecting.

2) AI will come down to earth—and get to work

Leaders don’t need to adopt AI for AI’s sake. Instead, when they look for the best solution to a business need, AI will increasingly play a role. Does the organization want to automate billing, general accounting and budgeting, and many compliance functions? How about automating parts of procurement, logistics, and customer care? AI will likely be a part of the solution, whether or not users even perceive it.

3) AI will help answer the big question about data

Those enterprises that have already addressed data governance for one application will have a head start on the next initiative. They’ll be on their way to developing best practices for effectively leveraging their data resources and working across organizational boundaries. There’s no substitute for organizations getting their internal data ready to support AI and other innovations, but there is a supplement: Vendors are increasingly taking public sources of data, organizing it into data lakes, and preparing it for AI to use.

4) Functional specialists, not techies, will decide the AI talent race

Enterprises that intend to take full advantage of AI shouldn’t just bid for the most brilliant computer scientists. If they want to get AI up and running quickly, they should move to provide functional specialists with AI literacy. Larger organizations should prioritize by determining where AI is likely to disrupt operations first and start upskilling there.

5) Cyberattacks will be more powerful because of AI—but so will cyberdefense

In other parts of the enterprise, many organizations may choose to go slow on AI, but in cybersecurity there’s no holding back: Attackers will use AI, so defenders will have to use it too. If an organization’s IT department or cybersecurity provider isn’t already using AI, it has to start thinking immediately about AI’s short- and long-term security applications. Sample use cases include distributed denial of service (DDoS) pattern recognition, prioritization of log alerts for escalation and investigation, and risk-based authentication. Since even AI-wary organizations will have to use AI for cybersecurity, cyberdefense will be many enterprises’ first experience with AI. We see this fostering familiarity with AI and willingness to use it elsewhere. A further spur to AI acceptance will come from its hunger for data: The greater AI’s presence and access to data throughout an organization, the better it can defend against cyberthreats. Some organizations are already building out on-premises and cloud-based “threat lakes” that will enable AI capabilities.
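One of the use cases named above – prioritization of log alerts for escalation – can be sketched as a simple risk score. The alert types, fields and severity weights below are hypothetical; in practice these would be learned from historical incident data rather than hard-coded.

```python
# Illustrative log-alert prioritization sketch. Severity weights and
# alert fields are hypothetical stand-ins for a learned risk model.

SEVERITY = {"failed_login": 2, "port_scan": 3, "data_exfiltration": 9}

def prioritize(alerts):
    """Rank alerts by severity weight times repeat count, highest first."""
    def score(alert):
        return SEVERITY.get(alert["type"], 1) * alert["count"]
    return sorted(alerts, key=score, reverse=True)

alerts = [
    {"type": "failed_login", "count": 40},
    {"type": "data_exfiltration", "count": 1},
    {"type": "port_scan", "count": 5},
]
ranked = prioritize(alerts)  # top of the list goes to an analyst first
```

Even this crude ranking shows the value of the approach: analysts investigate the highest-scoring alerts first instead of reading logs chronologically.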

6) Opening AI’s black box will become a priority

We expect organizations to face growing pressure from end users and regulators to deploy AI that is explainable, transparent, and provable. That may require vendors to share some secrets. It may also require users of deep learning and other advanced AI to deploy new techniques that can explain previously incomprehensible AI. Most AI can be made explainable—but at a cost. As with any other process, if every step must be documented and explained, the process becomes slower and may be more expensive. But opening black boxes will reduce certain risks and help establish stakeholder trust.

7) Nations will spar over AI

If China starts to produce leading AI developments, the West may respond. Whether it’s a “Sputnik moment” or a more gradual realization that they’re losing their lead, policymakers may feel pressure to change regulations and provide funding for AI. We expect more countries to issue AI strategies, with implications for companies. It wouldn’t surprise us to see Europe, which is already moving to protect individuals’ data through its General Data Protection Regulation (GDPR), issue policies to foster AI in the region.

8) Pressure for responsible AI won’t be on tech companies alone

As organizations face pressure to design, build, and deploy AI systems that deserve trust and inspire it, many will establish teams and processes to look for bias in data and models and closely monitor ways malicious actors could “trick” algorithms. Governance boards for AI may also be appropriate for many enterprises.


Click here to access PWC’s detailed predictions report

 

The General Data Protection Regulation (GDPR) Primer – What The Insurance Industry Needs To Know, And How To Overcome Cyber Risk Liability As A Result.

SCOPE

The regulation applies if the

  • data controller (organization that collects data from EU residents)
  • or processor (organization that processes data on behalf of data controller e.g. cloud service providers)
  • or the data subject (person)

is based in the EU. Furthermore, the Regulation also applies to organizations based outside the European Union if they collect or process personal data of EU residents. Per the European Commission, “personal data is any information relating to an individual, whether it relates to his or her private, professional or public life. It can be anything from

  • a name,
  • a home address,
  • a photo,
  • an email address,
  • bank details,
  • posts on social networking websites,
  • medical information,
  • or a computer’s IP address.”

The regulation does not apply to the processing of personal data for national security activities or law enforcement; however, the data protection reform package includes a separate Data Protection Directive for the police and criminal justice sector that provides robust rules on personal data exchanges at national, European and international level.

SINGLE SET OF RULES AND ONE-STOP SHOP

A single set of rules will apply to all EU member states. Each member state will establish an independent Supervisory Authority (SA) to hear and investigate complaints, sanction administrative breaches, etc. SAs in each member state will cooperate with other SAs, providing mutual assistance and organizing joint operations. Where a business has multiple establishments in the EU, it will have a single SA as its “lead authority”, based on the location of its “main establishment” (i.e., the place where the main processing activities take place). The lead authority will act as a “one-stop shop” to supervise all the processing activities of that business throughout the EU. A European Data Protection Board (EDPB) will coordinate the SAs.

There are exceptions for data processed in an employment context and data processed for national security purposes, which may still be subject to individual country regulations.

RESPONSIBILITY AND ACCOUNTABILITY

The notice requirements remain and are expanded. Notices must include the retention period for personal data, and contact information for the data controller and the data protection officer must be provided.

Automated individual decision-making, including profiling (Article 22), is made contestable. Citizens now have the right to question and challenge decisions that affect them when those decisions have been made on a purely automated basis.

To be able to demonstrate compliance with the GDPR, the data controller should implement measures which meet the principles of data protection by design and data protection by default. Privacy by Design and by Default require that data protection measures are designed into the development of business processes for products and services. Such measures include the controller pseudonymizing personal data as soon as possible.
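Pseudonymization can be sketched with a keyed hash: the same identifier always maps to the same token, so records stay linkable for analysis, while re-identification requires a secret key the controller holds separately. This is a minimal illustration only; the key, field values, and storage arrangements are hypothetical, and real deployments need proper key management.

```python
import hmac
import hashlib

def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed HMAC-SHA256 token.

    Deterministic: the same input and key always yield the same token,
    preserving linkability across records without exposing the identifier.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key; in practice the controller stores it separately
# from the pseudonymized data set.
key = b"controller-held-secret"

token = pseudonymize("alice@example.com", key)
# Stable linkage: the same identifier maps to the same token.
assert token == pseudonymize("alice@example.com", key)
# Distinct identifiers yield distinct tokens.
assert token != pseudonymize("bob@example.com", key)
```

Note that under the GDPR pseudonymized data is still personal data (unlike anonymized data), because the controller retains the means to re-identify it.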

It is the responsibility and liability of the data controller to implement effective measures and to be able to demonstrate the compliance of processing activities, even if the processing is carried out by a data processor on behalf of the controller.

Data Protection Impact Assessments must be conducted when specific risks occur to the rights and freedoms of data subjects. Risk assessment and mitigation is required and prior approval of the Data Protection Authorities (DPA) is required for high risks. Data Protection Officers (DPO) are to ensure compliance within organizations.

A DPO must be appointed:

  • for all public authorities, except for courts acting in their judicial capacity
  • if the core activities of the controller or the processor consist of either
      ◦ processing operations which, by their nature, their scope and/or their purposes,
        require regular and systematic monitoring of data subjects on a large scale, or
      ◦ processing on a large scale of special categories of data pursuant to Article 9, or of
        personal data relating to criminal convictions and offences referred to in Article 10


 

Click here to access Clarium’s detailed paper

State of Digital Analytics: The Persistent Challenge of Data Access & Governance

Disjointed, inaccessible data is a major productivity inhibitor for analytics teams, diverting skilled resources from contributing to valuable business intelligence.

Analytics teams struggle with data access. In addition to listing data silos and data access among both their top data challenges and their top analytics challenges, nearly three in five said it takes days or weeks to access all the data needed for their work or the work of the teams they manage. Only a third were able to access all their data in a day or less.

AMOUNT OF TIME FOR ANALYSTS AND ANALYTICS TEAMS TO ACCESS DATA

Nearly two in five analytics professionals are spending more than half of their work week on tasks unrelated to actual analysis. Forty-four percent of managers reported that more than half of their team’s work week is spent accessing, blending, and preparing data rather than analyzing it, while 31 percent of analysts said they spend more than half of their work week on data housekeeping.

TIME SPENT PREPPING DATA, RATHER THAN ANALYZING IT

As a result, the majority of analysts have found it necessary to learn programming languages specifically to help them access and/or prepare data for analysis. Outside of mandates from their employers, a full 70 percent of analysts reported taking it upon themselves to learn to code for this reason, and more than a quarter of those analysts have spent 80 or more hours learning to program.

ANALYSTS LEARNING PROGRAMMING SKILLS TO OVERCOME DATA ISSUES

It should go without saying that data professionals tasked with turning organizational information into meaningful, actionable analysis cannot adequately perform their core job function without accurate data. Yet in addition to raising the data access challenges above, the industry is also split in terms of confidence in data accuracy. Nearly half reported that they regularly question the accuracy of the data they or the teams they manage use, while a little more than half said they are confident about their data.


Click here to access TMMData’s detailed Survey Results

Creating a Data-Driven Enterprise with DataOps

Let’s discuss why data is important, and what a data-driven organization is. First and foremost, a data-driven organization is one that understands the importance of data. It possesses a culture of using data to make all business decisions. Note the word all. In a data-driven organization, no one comes to a meeting armed only with hunches or intuition. The person with the superior title or largest salary doesn’t win the discussion. Facts do. Numbers. Quantitative analyses. Stuff backed up by data.

Why become a data-driven company? Because it pays off. The MIT Center for Digital Business asked 330 companies about their data analytics and business decision-making processes. It found that the more companies characterized themselves as data-driven, the better they performed on objective measures of financial and operational success. Specifically, companies in the top third of their industries when it came to making data-driven decisions were, on average, five percent more productive and six percent more profitable than their competitors. This performance difference remained even after accounting for labor, capital, purchased services, and traditional IT investments. It was also statistically significant, and was reflected in measurable increases in stock market valuations.

Another survey, by The Economist Intelligence Unit, showed a clear connection between how a company uses data and its financial success. Only 11 percent of companies said that their organization makes “substantially” better use of data than their peers. Yet more than a third of this group fell into the category of “top performing companies.” The reverse relationship also holds: Of the 17 percent of companies that said they “lagged” their peers in taking advantage of data, not one was a top-performing business.

But how do you become a data-driven company? According to a Harvard Business Review article written by McKinsey executives, being a data-driven company requires simultaneously undertaking three interdependent initiatives:

Identify, combine, and manage multiple sources of data

You might already have all the data you need. Or you might need to be creative to find other sources for it. Either way, you need to eliminate silos of data while constantly seeking out new sources to inform your decision-making. And it’s critical to remember that when mining data for insights, demanding data from different and independent sources leads to much better decisions. Today, both the sources and the amount of data you can collect have increased by orders of magnitude. It’s a connected world, given all the transactions, interactions, and, increasingly, sensors that are generating data. And the fact is, if you combine multiple independent sources, you get better insight. The companies that do this are in much better shape, financially and operationally.
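The mechanics of combining independent sources can be sketched as a merge on a shared key, with each source filling in attributes the others lack. The source names and field names below (CRM records, web analytics, an `id` key) are hypothetical illustrations of the pattern, not a prescription.

```python
def combine_sources(*sources):
    """Merge records from independent sources on a shared 'id' field.

    Each source is a list of dicts. Records sharing an id are folded
    into one combined record; later sources add fields the earlier
    ones lack (and overwrite on conflict).
    """
    combined = {}
    for source in sources:
        for record in source:
            combined.setdefault(record["id"], {}).update(record)
    return list(combined.values())

# Hypothetical silos: a CRM system and a web-analytics feed.
crm = [{"id": 1, "name": "Acme", "segment": "enterprise"}]
web = [{"id": 1, "visits": 42}, {"id": 2, "visits": 7}]

merged = combine_sources(crm, web)
# Record 1 now carries both CRM and web attributes;
# record 2 exists only in the web feed but is still retained.
```

At enterprise scale this job falls to data integration tooling rather than hand-written merges, but the decision that matters is the same: agree on shared keys and a common vocabulary so independent sources can be joined at all.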

Build advanced analytics models for predicting and optimizing outcomes

The most effective approach is to identify a business opportunity and determine how the model can achieve it. In other words, you don’t start with the data—at least at first—but with a problem.

Transform the organization and culture of the company so that data actually produces better business decisions

Many big data initiatives fail because they aren’t in sync with a company’s day-to-day processes and decision-making habits. Data professionals must understand what decisions their business users make, and give users the tools they need to make those decisions.


Click here to access the ebook Data Driven Organizations

A Field Guide to Data Science

  • Data Science is the art of turning data into actions.

It’s all about the tradecraft. Tradecraft is the process, tools and technologies for humans and computers to work together to transform data into insights.

  • Data Science tradecraft creates data products.

Data products provide actionable information without exposing decision makers to the underlying data or analytics (e.g., buy/sell strategies for financial instruments, a set of actions to improve product yield, or steps to improve product marketing).

  • Data Science supports and encourages shifting between deductive (hypothesis-based) and inductive (pattern-based) reasoning.

This is a fundamental change from traditional analysis approaches. Inductive reasoning and exploratory data analysis provide a means to form or refine hypotheses and discover new analytic paths. Models of reality no longer need to be static. They are constantly tested, updated and improved until better models are found.

  • Data Science is necessary for companies to stay with the pack and compete in the future.

Organizations are constantly making decisions based on gut instinct, loudest voice and best argument – sometimes they are even informed by real information. The winners and the losers in the emerging data economy are going to be determined by their Data Science teams.

  • Data Science capabilities can be built over time.

Organizations mature through a series of stages – Collect, Describe, Discover, Predict, Advise – as they move from data deluge to full Data Science maturity. At each stage, they can tackle increasingly complex analytic goals with a wider breadth of analytic capabilities. However, organizations need not reach maximum Data Science maturity to achieve success. Significant gains can be found in every stage.

  • Data Science is a different kind of team sport.

Data Science teams need a broad view of the organization. Leaders must be key advocates who meet with stakeholders to ferret out the hardest challenges, locate the data, connect disparate parts of the business, and gain widespread buy-in.
