Articles about Data Architecture & Integration
Are modern data platforms on your gratitude list? When you’re dealing with an inflexible, monolithic technical architecture, getting the right information at the right time is like trying to cook a traditional Thanksgiving feast in a microwave. You need better tools for the job. Thankfully, modern data solutions like data mesh frameworks can help. A data mesh distributes information across autonomous domains, allowing business users to own, manage, and share their data in a separate virtual environment while governance remains centralized. It’s the data equivalent of asking your cousins to make the side dishes while you handle the turkey and set the table. Interested in figuring out your options for modernizing and democratizing your data frameworks? Check out our latest deep dive on modern data solutions. Your data consumers will be grateful!

Get your extra helping of modern data trends >>
Accessing data at the speed of business is critical to remaining competitive in a digital-first world. But if you’re relying on outdated architecture where your data is trapped in silos or lost in a data lake, access to the functional data you need is seriously limited. When your existing framework is no longer serving your business, it makes sense to transition to a modern data platform, but you may have hesitations about whether it can help you succeed. To help you better understand this solution and what you stand to gain from it, we are looking at data platform capabilities and sharing five modern data platform imperatives that will help you achieve a more logical data management system.

What is a modern data platform?

With so many emerging data solutions, we understand that the data landscape can be complicated, so we want to start by clearly defining what a modern data platform is and what it can do. A modern data platform is a flexible, cloud-based, end-to-end data architecture that supports collecting, processing, analyzing, and delivering data to the end user in a way that is aligned and responsive to the needs of the business. On the surface, aside from being cloud-based rather than on-premises, modern data platform capabilities aren’t so different from those of traditional data architecture. The difference is in how new technologies have expanded those capabilities. Here are some of the ways modern data platforms can deliver more for your organization:

Data ingestion

Bringing new data into the environment is the first step to managing data, and in a legacy architecture, that is mainly done through batch processing. Batching collects and processes data at set intervals. By leveraging the higher computing capacity of a cloud-based architecture, data can be streamed in real time to data storage units, eliminating bottlenecks and delays to keep data moving through the system in a more fluid manner.
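The contrast between batch and streaming ingestion can be sketched in a few lines of Python. This is a minimal illustration only; the function names and the in-memory "sink" are hypothetical stand-ins for real pipeline components such as a message queue and a storage layer:

```python
from typing import Callable, Iterable, List

def batch_ingest(records: Iterable[dict], sink: Callable[[List[dict]], None],
                 batch_size: int = 3) -> None:
    """Legacy-style ingestion: hold records and flush them in fixed-size batches."""
    buffer: List[dict] = []
    for record in records:
        buffer.append(record)
        if len(buffer) >= batch_size:
            sink(buffer)   # downstream only sees data once a batch fills
            buffer = []
    if buffer:             # flush the remainder at the end of the window
        sink(buffer)

def stream_ingest(records: Iterable[dict], sink: Callable[[List[dict]], None]) -> None:
    """Streaming-style ingestion: forward each record as soon as it arrives."""
    for record in records:
        sink([record])     # downstream sees every record immediately

# Tiny demo: count how many deliveries the sink receives in each mode.
events = [{"id": i} for i in range(5)]

batches: List[int] = []
batch_ingest(events, lambda chunk: batches.append(len(chunk)), batch_size=3)

streams: List[int] = []
stream_ingest(events, lambda chunk: streams.append(len(chunk)))

print(batches)  # [3, 2] -> two delayed deliveries
print(streams)  # [1, 1, 1, 1, 1] -> five immediate deliveries
```

The same five records reach storage either way; the difference is how long each one waits before the rest of the system can see it.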
Quality and governance

With AI integrated into the architecture, data quality and governance tools can be automated, speeding up how new data sources are analyzed, categorized, and assessed for security concerns.

Security

Security measures can be integrated at the base level for new data products, providing inherent encryption whether data is at rest or in transit. Within a modern data platform, security measures are implemented to dynamically filter and obscure data as needed to support your organization’s security policies.

Storage

Cloud-based architecture offers the potential for nearly unlimited storage on a pay-as-you-go model, so you only need to invest in the volume of storage you need today. As your data storage needs increase in the future, you can add and seamlessly integrate additional space without creating silos for new data.

Transformation

In legacy architecture, transformations such as quality adjustments and business logic need to be applied in the early stages of data flow during large batch processing. While this ensures that the downstream usage of the data is more performant, it also locks the business rules in place, which removes flexibility in how the business looks at and interacts with the data. The expanded computing power and advanced tools in a modern data platform offer a more flexible timeline for adding transformations to the data. Business rules and logic can be applied later in the data flow and adapted to suit changing needs.

Discovery

Data discovery is streamlined through integrated tools within a modern data platform that automatically scan and categorize metadata, organizing it so the most appropriate data can be accessed more easily and quickly.

Delivery

In a legacy architecture, data delivery and visualization tools required the data to be specifically structured prior to business usage, whether for reporting, data extracts, or API access.
Now, visualization tools have advanced features that support access to semi-structured and unstructured data without the need for intensive (and expensive) data processing. Integrated tools simplify both data extraction and data sharing and have built-in security and monetization features.

DevOps and DataOps

In a modern data platform, DevOps and DataOps tooling is cross-platform and cross-language, which makes it easier and faster to coordinate development and release implementation tasks when architectures are built using multiple tools.

5 modern data platform imperatives

The overall framework, capabilities, and patterns of managing data are universal within a modern data platform. However, no two platforms are the same. Each one is highly customized to support the data and data needs of the organization and requires different combinations of tools or features to achieve specific functionalities and cover the needed capabilities. You still need to ensure your platform manages the data in a way that aligns with your organization’s unique needs, and that means meeting these five modern data platform imperatives.

1. Greater flexibility

The greatest challenge of legacy data architecture is the lack of flexibility. The physical servers can’t be added to or modified easily to meet the changing data needs of your organization, so they need to be built with the capacity for future data needs. This is easier said than done given the rapidly changing landscape and the sheer volume of data you’re taking in. A modern data platform is incredibly flexible. It allows you to consider your data needs today and budget accordingly rather than trying to predict your future data needs, which would require a significantly larger investment. As you need to increase data storage, adopt automation, or pivot in your data needs, these updates can be integrated seamlessly into the platform.

2. Improved access

The people and applications accessing data need it in real time and in the proper format, but the needs of your data science team vary greatly from the needs of your business intelligence team. A modern data platform must support a faster time to market for data assets, and one way it does this is through a medallion architecture. A medallion architecture creates a multi-layered framework within the platform to move data through a pipeline to the end user.

Bronze layer: Raw data is collected directly from the source systems with little to no transformation and stored here to provide a base layer of full history for additional processing.
Silver layer: Data from multiple sources is curated, enriched, integrated, and organized in a structure that reflects the data domains of the organization.
Gold layer: Data needed to support specific business drivers is aggregated and organized so it can be used for dashboard creation and self-service analysis of current states and trends.

This architecture allows a diverse user base to access the data in the form that best suits their needs. Data scientists can access raw data from the bronze layer to identify new and emerging patterns, business applications can access data in the silver layer to produce data products, and business users can access the gold layer to perform analytics and create dashboards.

3. Incremental implementation

Rather than transitioning to a modern data platform in a single, giant step, we recommend an incremental move. This makes it significantly easier and faster to focus on the current data products your organization needs, like reports and dashboards, while you are starting to build out the initial infrastructure. An incremental implementation lets you take a clear, informed look at the data you need, how you need it, and how it aligns with your business drivers.
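As a rough illustration of the medallion architecture described above, here is a minimal, hypothetical Python sketch. The row shapes and business rules are invented for the example; a real platform would implement these layers with its own storage and pipeline tooling:

```python
# Hypothetical medallion flow: raw events land in bronze, are cleaned and
# conformed into silver, then aggregated into a dashboard-ready gold table.

raw_events = [  # bronze: source data kept as-is, including a malformed row
    {"customer": "a", "amount": "10.00", "region": "east"},
    {"customer": "b", "amount": "oops", "region": "east"},
    {"customer": "c", "amount": "5.50", "region": "west"},
]

def to_silver(bronze: list) -> list:
    """Curate and enrich: cast types and drop rows that fail validation."""
    silver = []
    for row in bronze:
        try:
            silver.append({**row, "amount": float(row["amount"])})
        except ValueError:
            continue  # quarantine malformed amounts instead of loading them
    return silver

def to_gold(silver: list) -> dict:
    """Aggregate for a business driver: revenue per region."""
    totals = {}
    for row in silver:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals

gold = to_gold(to_silver(raw_events))
print(gold)  # {'east': 10.0, 'west': 5.5}
```

Note that the bronze list keeps the full, unmodified history (including the bad row) for data scientists, while business users only ever see the aggregated gold result.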
You can then choose to add, adjust, or stop processing certain data to put more focus on the data that will answer pivotal business questions. At the same time, by building only what you need when you need it, an incremental implementation saves money and avoids bringing over old data that no longer serves your business.

4. Better communication between IT and business users

A modern data platform needs to support improved communication between your IT or data engineering teams and your business users. As data flows through the framework and reaches the end user in the language they speak, the end user gains greater clarity. For business users, this may mean seeing gaps where the existing data doesn’t directly answer their questions and finding a different way to utilize it. For data engineers, this may mean seeing opportunities to filter out aberrations and improve the aggregated data. This clarity allows the teams to work together to target solutions that will cover existing or emerging needs.

5. Refocus valuable resources

Once the initial data set is built, we apply repeatable patterns to the mechanics controlling data ingestion, storage, and delivery. Having a proven framework that can be repeated across unlimited data sets saves time and reduces the cost of building, operating, and maintaining the platform. Your data team can refocus their time on higher-level tasks, including improving data quality and speeding up delivery.

Whether you have questions about data platform capabilities and functionalities or you’re ready to make the shift to a modern data platform, we’re here to help! Set up a call to talk to an expert or visit our modern data platform hub to learn more.

Ask us your questions >>
Learn more about modern data platforms >>
In a perfect world, all your data would be stored in an updated, organized database or data warehouse where your business intelligence and analytics teams could keep your company ahead of the competition by accessing the precise data they need in real time. In reality, as your organization has grown, your data has probably been stretched across multiple locations, including outdated databases, localized spreadsheets, cloud-based platforms, and business apps like Salesforce. This not only causes costly delays in accessing information, but also impacts your teams’ ability to make informed, data-driven decisions related to both day-to-day operations and the long-term future of your organization. So, how do you improve access to your data when it’s siloed in multiple areas? Data virtualization, while still fairly new, is an efficient, effective data delivery solution that offers real-time access to the data your teams need, and it is rapidly growing in popularity among large and enterprise-level organizations. The market was estimated at $1.84 billion in 2020 and, at a 20.9 percent CAGR, is projected to surpass $8 billion by 2028, according to a 2022 Verified Market Research report. To help you determine if data virtualization solutions are the best option for your company, we’ll take a look at what data virtualization is, how it can solve your greatest data challenges, and how it stacks up against other data integration solutions.

Understanding data virtualization

First, what is data virtualization? When you have data housed across multiple locations and in various states and forms, data virtualization integrates these sources into a single layer of information, regardless of location or format, without having to replicate your information into new locations.
While this layer of data is highly secure and easily managed within governance best practices, it allows the data consumers within your organization to access the information they need in real time, bypassing the need to sift and search through a variety of disparate sources.

Data virtualization supports your existing architecture

Data virtualization does not replace your existing data architecture. Instead, it’s a single component in a larger data strategy, but it is often essential in executing the strategy successfully and meeting the goals of your organization. Think of your current data architecture as an old library where your data is kept on a variety of shelves, over multiple floors, and some of it is even stored in boxes in the basement. When you are looking for specific information, you have to go on an exhaustive, lengthy search, and you may not even find what you need. Data virtualization acts as the librarian who understands the organizational system, knows exactly where everything is located, and can provide you with the information you need immediately.

Choosing data virtualization vs an ETL solution

When reporting is delayed, analytics are inaccurate, and strategic planning is compromised due to bottlenecks, it’s essential that your organization prioritizes how data is integrated and accessed. Traditionally, organizations would choose Extract, Transform, and Load (ETL). ETL is an intensive process in which all your data is duplicated from the original sources and moved into a data warehouse, database, or other storage. While ETL can bring your data together, there are two key problems with this method. First, the cost of moving and relocating data is often the chief concern organizations have. Second, while ETL improves your collection of data by consolidating it in one location, it doesn’t improve your connection to the analyzable data needed to improve day-to-day operations.
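To make the ETL pattern described above concrete, here is a minimal, hypothetical sketch; the source names and rows are invented for illustration. Notice that every row is physically copied into the warehouse, and the business rules are baked in at load time:

```python
# Hypothetical minimal ETL run: data is *copied* out of each source system,
# reshaped, and loaded into a separate warehouse store. The duplication and
# the up-front, locked-in transform are the two costs discussed above.

crm_source = [{"name": "Ada", "spend": "120"}]   # spend stored as text in the CRM
erp_source = [{"name": "Ada", "orders": 3}]

def extract() -> list:
    # copies every row out of the original systems
    return [dict(row) for row in crm_source + erp_source]

def transform(rows: list) -> list:
    # business rules are applied here, once, for all downstream users
    return [{**r, "spend": float(r["spend"])} if "spend" in r else r for r in rows]

def load(rows: list, warehouse: list) -> None:
    # a second, physical copy of the data now exists in the warehouse
    warehouse.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(len(warehouse))  # 2 -> both source rows duplicated into the warehouse
```

Changing a rule (say, how spend is typed) means re-running the whole pipeline, which is exactly the rigidity virtualization avoids.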
On the other hand, data virtualization solutions streamline how you access and connect to your data. Your business users submit a query, and the Denodo data virtualization platform pulls the data from across locations, extracts the relevant information, and delivers it in real time in the needed format so it’s ready to analyze and use. The result? Increased productivity, reduced operational costs, and improved agility among business users, while your architects and IT teams have greater control over governance and security.

Take a deeper dive into data virtualization solutions

Ready to dig deeper into data virtualization? We partnered with data management leader Denodo Technologies to put together Modernizing Integration with Data Virtualization, a highly informative webinar to help you learn how data virtualization helps your company save time, reduce costs, and gain better insight into your greatest asset. To learn how Fusion Alliance can create custom data virtualization solutions to scale your data management and improve access, reach out to our team. Ask us any questions or set up a quick call to explore your options.

Learn more about modern data platforms >>
Whether your data is housed in a monolithic data architecture or across multiple, disparate sources such as databases, cloud platforms, and business applications, accessing the specific information you need when you need it probably presents a huge challenge. The length of time it takes to find data may have you or your analytics teams constantly relying on outdated information to run reports, develop strategies, and make decisions for your organization. If you’re exploring data solutions that will improve time-to-market while simplifying governance and increasing security, you’ve probably come across the terms “data fabric” and “data mesh,” but you may not know how to apply them to your business. To help you better understand these emerging trends in data architecture, we’re digging into what a data fabric and a data mesh are and the specific benefits they bring to large and enterprise-level organizations. This will give you the foundational knowledge to determine how to choose between data fabric and data mesh, or how both may be able to serve your organization.

What is data fabric?

When you think of every bit of data in your organization as an individual thread, it makes sense that it takes so long to access specific information. If thousands of individual threads are stored together in a bin, like in a monolithic architecture, or separated across hundreds of individual boxes with little to no organizational method, like in a distributed architecture, how long would it take to find the single thread you’re looking for and get it untangled so you can use it? A logical data fabric solves this problem by weaving all the threads of data together into an integrated, holistic layer that sits above the disparate sources in an end-to-end solution.
Within the layer are multiple technologies working together to catalog and organize the data, while machine learning and artificial intelligence are implemented to improve how new and existing data are integrated into the fabric as well as how data consumers access it.

Are data virtualization and data fabric the same?

A common misconception is that data virtualization and data fabric are the same. On the surface, they both support data management through the creation of a single, integrated layer of processed data atop distributed or unstructured data. Data virtualization is an integrated abstraction layer that speeds up access to data and provides real-time data returns, and this technology is a key component within the data fabric. However, data virtualization is still only one of the multiple technologies comprising the data fabric, which is a more comprehensive data management architecture.

Benefits of data fabric

Now that you have a better understanding of what data fabric is, let’s consider the problems it solves and why it may be right for your organization.

Access your data faster

When your data is in multiple formats and housed in a variety of locations, gaining access to the specific details you need can take hours, days, or even weeks, depending on your architecture. A logical data fabric leverages metadata, semantics, and machine learning to quickly return the needed data from across multiple sources, whether it’s a large amount of historic information or highly specific data used to drill down into a report.

Democratize your data

Data fabric uses advanced semantics, so the data is accessible in the language of business users, such as BI and analytics teams. Data consumers within the organization can access what they need without having to go through data engineers or the IT department, eliminating bottlenecks and sharing ownership of data.
Improve governance

Because of the automation capabilities of data fabric, you can implement a governance layer within the fabric. This applies global policies and regulations to data while allowing local metadata management to reduce risk and ensure compliance.

What is data mesh?

Monolithic data architecture keeps data in one centralized location. On paper, this seems like a more cost-effective, efficient option compared to a distributed architecture, but it still brings several challenges. Consider that in many large organizations relying on a monolithic architecture, massive volumes of unstructured data are stored in a data lake. Before information can get into the hands of data consumers or be productized, the data must be accessed and processed through the IT department, creating significant bottlenecks and bringing time to market to a crawl. A data mesh can solve this challenge. It is a new type of data architecture, first proposed in 2019 by Zhamak Dehghani of Thoughtworks, in which a framework shifts data from a monolithic architecture to a decentralized one. More specifically, the data is distributed across autonomous business domains where the data consumers own, manage, and share their own data as they see fit. While the domains are given a separate virtual schema and server so they can have full ownership over data productization, governance, security, and compliance remain unified within the monolith.

Benefits of data mesh

The challenges of centralized data ownership include latency; the added costs of storage, software, and replication; and a lack of practical access for consumers. Implementing a data mesh can solve these.

Eliminate IT bottlenecks

When all data is forced to go through the IT department before being distributed to the individuals or teams requesting it, bottlenecks occur and slow down the flow of data. A data mesh lets data bypass the IT department and flow directly to the teams that need it.
Improve flexibility and agility

Finding specific information within the massive volume of unstructured, undefined data stored in a data lake requires increasingly complicated queries. A data mesh, however, gives ownership of datasets to individual teams or business owners, simplifying access and offering real-time results through scalable, automated analytics.

Increase connection to data

By transferring data ownership to the data consumers, those who use the data directly have a greater connection to it. The data is available in the language of business, and it can be shared across teams with greater ease and transparency.

Choosing data fabric vs data mesh

Data fabric and data mesh both support data democratization, improve access, eliminate bottlenecks, and simplify governance. While data fabric is built on a technology-agnostic framework to connect data across multiple sources, data mesh is an API-driven, organizational framework that puts data ownership back in the hands of specific domains. So, which is better in the debate between data fabric and data mesh? The simple answer is that neither one is better than the other; the right option is determined by the use case. If the goal of your organization is to streamline data and metadata to improve connection and get real-time results across multiple teams, a data fabric built on a data virtualization platform can help you meet your goals. On the other hand, if you need to improve the process of data productization and decentralize your data, a data mesh may be the best option. But the real answer is that, contrary to popular belief, the two are not mutually exclusive, and many businesses succeed by implementing both. Data fabric and data mesh are complementary solutions that can work together to solve the challenges of your existing architecture.

Learn more about data fabric and data mesh

Want to gain further insight into choosing data fabric or data mesh?
We partnered with data management leader Denodo Technologies for a recorded webinar. In Logical Data Fabric vs Data Mesh: Does It Matter? we provide an in-depth look at monolithic and distributed data architecture, the challenges they bring, and how both data fabric and data mesh can improve agility, reduce costs, and elevate the quality of your data. To ask additional questions or learn how Fusion Alliance can help you create and implement a successful data strategy to meet your unique challenges and goals, connect with our team today. Learn more about modern data platforms >>
The importance of data classification

Often presented as a click-bait internet poll, the question “Is cereal a soup?” is only baffling until you realize that the answer hinges on how you define the term. Merriam-Webster contends that soup is a liquid sustenance often containing pieces of solid food. Therefore, as one respondent said, cereal is a soup “technically, though not existentially.” Proper definition of terms is also critical when it comes to classifying your data. To get the most from your data assets, you’ll need a strong data strategy, supported by definitions like:

How information is grouped, weighted, and prioritized
How common dimensions will be conformed
How data will be standardized, cleansed, and tagged

Your data use cases, sources, and architecture are unique. How you define your data strategy should be, too. Fusion’s team of data, technology, and digital experts can help you architect and implement a comprehensive data strategy, offer insights and best practices to support a growing data culture, or step in to solve a particular problem. Don’t let data eat your business for breakfast. Learn more about defining your data terms or get in touch for a quick consultation.
Today’s businesses collect more data than ever before, but many don’t have the architecture in place to store, process, and recall the data in real time. Whether an enterprise-level organization stores all its data in a single data lake or relies on multiple, disparate sources, both options cause significant delays in finding the specific information you’re looking for. Traditionally, if your organization wanted to update and upgrade the existing architecture, the only option was to extract, transform, and load (ETL) the data into a new framework. Implementing a logical data fabric offers a better alternative, giving companies a cost-effective, efficient way to collect and integrate data while building a stronger framework across the organization. At a recent CDO Data Summit, Mark Johnson, Fusion Alliance Executive Vice President and editorial board chair for CDO magazine, sat down with thought leaders in the data industry to discuss why logical data fabric is essential in accelerating time to value.

What is a logical data fabric?

When you have multiple disparate data sources, a data fabric acts like a net cast over the top, pulling individual information sets together in an end-to-end solution. Data fabric is a technology-driven framework that lies within the existing architecture, unlike a data mesh, which is a methodology regarding how data should be distributed among data owners and consumers. In a logical data fabric, multiple technologies are implemented to catalog and organize existing data and integrate new data into the fabric. Data virtualization is the central technology deployed within this framework, creating an abstracted layer of unified data that is more secure and easily accessible.

What challenges are solved by a data fabric architecture?
Logical data fabric architecture offers a solution to the challenges faced by organizations that rely on numerous data storage solutions or repositories of structured and unstructured data:

Overcome slow data delivery

By consolidating data into an integrated semantic layer, common business applications can process, analyze, and return the data in real time, in the language of the data consumer. This improves accessibility and significantly reduces the latency that comes from applications having to search across multiple sources to return information.

Simplify governance

If every data warehouse, database, and cloud-based platform within your organization relies on separate governance, you are dealing with significant inconsistencies. By stitching the data together in a logical data fabric, centralized governance can be applied across all data and automated to maintain and streamline the process.

Reduce IT bottlenecks

Data fabric automates how data is processed, integrated, governed, and utilized, enabling real-time analytics and reporting. This puts data in the hands of your BI and analytics teams more quickly while removing bottlenecks from your IT department. With a logical data fabric architecture, your business can respond to trends and changes within your industry more quickly, helping you evolve both short- and long-term strategies to reflect what your data is telling you in real time.

Is a logical data fabric the right solution for your organization?

Learn more about data fabric architecture from the CDO Data Summit’s round table discussion. Mark Johnson is joined by:

Baz Khauti, President at Modak USA
Richie Bachala, Principal, Data Engineering at Yugabyte
Ravi Shankar, SVP and Chief Marketing Officer at Denodo
Saj Patel, VP of Data Solutions at Fusion Alliance

This panel addresses critical questions about data in today’s business to help you solve your unique data challenges, including:

Is the fabric of data virtual, physical, or both?
How do we get value out of our data?
Do we take a connect or collect approach?
How comprehensive do we need our data approach to be?
Are we optimizing for agility or for flexibility?
How do we deliver unified data?
Is the organization in agreement with what we are looking for out of their data?
What AI/ML techniques do we want to employ, if any?

If you have specific questions or are ready to take the next step and learn how we can help you create custom data solutions for your organization, reach out to us today for a quick chat!

Learn more about modern data platforms >>
Although still gaining momentum, data virtualization is on a fast track to address the challenges of traditional integration solutions, offering faster time-to-market for data and business capabilities, access to a broader range of data across your data ecosystem, and an integrated approach to management and governance. Data virtualization enables access to data without replicating or storing any of the data. It’s essentially a virtual layer of information that allows you to access and integrate data from various sources and systems seamlessly. But what are the typical use cases of data virtualization, and what are some of the challenges businesses encounter when trying to put it in place? Here we’ll dive into both questions, along with the potential opportunities data virtualization provides.

Common uses of data virtualization

Introducing data virtualization into an organization is generally use-case-driven. If your company fits into one of the following three use cases, it may benefit from implementing this type of holistic strategy.

You have data in numerous locations

The primary use case for data virtualization occurs when companies have data in multiple locations. For instance, if your business has migrated data to the cloud or multiple cloud locations, but still has data on-premises, virtualization can pull all that data into one access point. Virtualization is a great candidate to make siloed information look united to the business, even when the data exists in separate environments. But virtualization doesn’t just affect appearances: it also makes data from disparate sources simpler to access, which benefits the users. For instance, many companies collect and store customer data in multiple platforms, which can make it difficult for the organization to discern a true 360° view of the customer.
Data virtualization can seamlessly integrate the data across platforms to present a single, unified view — saving time over manual analysis and reducing the risk of key data points falling through the cracks.

Learn more about customer data strategy >>

You’re trying to migrate to the cloud

At this point, most companies are trying to modernize and move to the cloud to save money and time. But not all companies can move their data that quickly and completely abandon the legacy system. Instead, they migrate data a little bit at a time, which can be a tedious and lengthy process. Data migration projects can take months, and during that time, business users could spend significant time finding, reconciling, and analyzing the data manually. The result is a considerable loss of business opportunity, with teams unable to respond to the business’s needs for data. Virtualization bridges the gap during this transition period, streamlining effort and boosting efficiency in the short term, so data can be migrated over time without negatively impacting business users.

Learn more about cloud migration strategy >>

You want to move from a DWH to a DaaS model

Some companies prefer to bypass putting data into their traditional data warehouse (DWH) in favor of a data-as-a-service (DaaS) solution. For these organizations, the time-to-value savings of getting data into the hands of users more quickly overrides the case for standing up and maintaining their own DWH. Data virtualization enables companies to bypass the need to create ETL processes entirely and serve up unified data views from any combination of DaaS and legacy sources. As long as your organization has thoughtful governance in place and has considered the potential privacy and security impact, data virtualization can quickly harmonize a DaaS strategy.
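The unified customer view described in the first use case can be sketched in a few lines. This is a conceptual illustration only, not the API of Denodo or any real virtualization platform; the three source dictionaries stand in for live systems that would be queried at request time:

```python
# Hypothetical sketch of a virtual unified view: sources are queried when a
# request arrives and joined in memory - nothing is replicated into a new store.

crm = {"cust-1": {"name": "Ada Lovelace", "email": "ada@example.com"}}
billing = {"cust-1": {"balance": 42.5}}
support = {"cust-1": {"open_tickets": 1}}

def customer_360(customer_id: str) -> dict:
    """Federate three systems into one customer view, resolved per query."""
    view = {"id": customer_id}
    for source in (crm, billing, support):
        # pull live from each source at query time; missing sources contribute nothing
        view.update(source.get(customer_id, {}))
    return view

profile = customer_360("cust-1")
print(profile["name"], profile["balance"], profile["open_tickets"])
```

Because the view is assembled per query, an update in the billing system is visible on the very next request, with no ETL refresh cycle in between.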
Learn best practices for evaluating your data storage options >> Common roadblocks to data virtualization If you’re considering data virtualization for your business, there are some potential constraints to consider. Incomplete MDM framework. If your company has outstanding master data challenges to address, it is best to have a master data strategy defined and solution options incorporated into the data architecture before data virtualization can take full shape. Mastering data is often a process and organizational change management problem to solve. Dispersed subject matter expertise. Creating a data virtualization solution requires thorough knowledge of your data, the business rules surrounding your data, and a strong understanding of the business needs for using that data. Since data virtualization brings disparate data together, the subject matter expertise on the various data domains can be spread throughout the organization. Identifying these SMEs and ensuring their engagement is a key enabler of success with data virtualization. Governance issues. You never want to overlook governance in a rush to meet business requirements. Accountability and ownership of data are essential tenets of a successful data management framework. Before implementing a data virtualization project, be sure you have a solid governance operating model in place to ensure security, compliance, and data quality. Although data virtualization can be a transformative solution for many companies, it’s not your only option. Sometimes the use case isn’t quite there, or privacy and governance concerns outweigh the potential value of a data virtualization effort. Fortunately, there are multiple ways to realize the value of your data. 
Explore your data integration & architecture options >> Setting yourself up for data virtualization success Data virtualization can be an excellent solution for businesses struggling with integration challenges that are holding back the speed and scale of business growth. Collecting data from multiple platforms and presenting it in one unified view for business users streamlines workflows and makes data easier to use and digest across the organization. It’s not a one-size-fits-all solution, but for certain use cases, data virtualization offers significant value. How do you determine if data virtualization is a good fit for your business? How can you define and evaluate potential use cases to understand the potential pitfalls and weigh them against an accurate projection of benefits? In our Data Virtualization Discovery Workshops, expert teams walk you and your key stakeholders through your unique constraints and opportunities, identifying the right next steps to advance your data management strategy. Starting with your current state architecture and building toward a true 360° view of your data, we’ll work with you to determine if data virtualization is a good fit, and which use cases will help you realize its value for solving some core business problems. Have a question about data virtualization? Ask us anything >> Ready to get started? Explore our Data Virtualization Discovery Workshops >>
This article originally appeared in CDO magazine. Data and analytics have long held promise in helping organizations deliver greater value across the entire stakeholder landscape, including customers, associates, and partners. However, since the beginning of the data warehousing and BI movement, achieving business value rapidly — in alignment with windows of opportunity — has proven elusive. For an organization to be competitive in the era of digital transformation, data must be front and center — and accessible in near real-time. But many organizations are struggling with data that is deeply buried, complex to access, difficult to integrate, and inaccessible to business users. Problems like these diminish the value of your data and its ability to inform decision-making at all levels. For most organizations, it’s hard to produce value from data quickly The main challenge has been the distributed and siloed nature of the data subjects that need to be integrated to achieve business-relevant insights. Data subjects — customers, products, orders, warehouses, etc. — typically reside in different systems/databases, requiring extraction, transformation, and loading into a common database where analytics can be performed. Often, data delivery solutions like data warehouses, self-service BI, and data lakes are used to try to unlock these data silos; however, each of these solutions presents drawbacks in terms of effort, complexity, cost, and time-to-market. That is where data virtualization comes in, delivering a holistic view of information to business users across all source systems. So what exactly is data virtualization? In its simplest form, data virtualization allows an organization to attach to its data subjects where they reside in real-time. It presents disparate data subjects through a semantic layer that enables them to be integrated on the fly to support query and analytic use cases. 
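As a rough mental model of that semantic layer, consider a thin mapping from business-friendly attribute names to the physical sources that actually hold them, resolved at query time. This is a hedged sketch with invented source, table, and column names — not Denodo’s or any other vendor’s real interface.

```python
# Toy semantic layer: each business-friendly attribute maps to the
# (source, table, column) that physically holds it. All names invented.
SEMANTIC_MODEL = {
    "customer_name": ("crm_db", "customers", "full_name"),
    "order_total":   ("erp_db", "orders",    "amount"),
    "warehouse":     ("wms_db", "stock",     "site_code"),
}

def plan_query(attributes):
    """Group requested logical attributes by the source system that must
    serve them -- the skeleton of an on-the-fly federated query plan."""
    plan = {}
    for attr in attributes:
        source, table, column = SEMANTIC_MODEL[attr]
        plan.setdefault(source, []).append(f"{table}.{column} AS {attr}")
    return plan

plan = plan_query(["customer_name", "order_total"])
print(plan)
```

Business users ask for logical names; the layer works out which systems to touch. That indirection is what lets sources move (say, during a migration) without breaking consumers.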
By eliminating the need to design and build complex routines that move data from multiple source locations into a single integrated data warehouse, products like Denodo enable organizations to compress weeks to months of data preparation time out of the idea-to-execution value stream. As a result, value delivery is significantly accelerated. Learn more about how Fusion & Denodo can help you streamline data access to support your most critical business needs >> Optimization with data fabric. While data virtualization integrates data from different sources into one layer to provide real-time access, data fabric is an end-to-end architectural approach that allows organizations to manage massive amounts of data in different places and automates the integration process. The thing about data fabric is that it has a huge job to do and must have a robust integration backbone to do it. A data fabric must support many data sources, be compatible with several data pipeline workflows, support automated data orchestration, empower various kinds of data consumers, and more. To do this successfully, a data fabric requires powerful technologies and a solid data integration layer to access all data assets. Many in the data community believe that you must choose data virtualization OR data fabric, but that is not the case — and that solid data integration layer is an example of why. The reality is that data fabric can be operationalized through data virtualization and optimizes your modern data architecture, allowing you to move at the speed of your business. By building a model that utilizes both concepts, businesses make finding, interpreting, and using their data nearly seamless. Technology by itself isn’t the answer. Even with the proven results of this class of technologies, many organizations continue to struggle with traditional data management and analytic architectures and solutions. 
This inability to adopt new approaches for data management and analytics only serves to deprive decision makers of rapid access to insights that are necessary to support agility in the pandemic-induced, rapidly transforming digital/global economy. The solution is not just found in technology. Instead, it is found in the minds of the humans responsible for delivering data management and analytic capabilities. It is a human change management problem we face. Remember the adage, people/process/data/technology? The next frontier to be conquered is optimizing the thinking and innovation risk tolerance of stewards of data management and analytics solutions within organizations. What do you think? Is your organization facing any of these issues or trying to tackle how to deliver significant value — better, faster, cheaper, smarter? I’m happy to chat about where you’re at and how to get where you would like to be. If you want to talk, send me a note.
Traditionally, IT experts created data assets for business users upon request. Of course, people still went rogue, capturing, analyzing, and visualizing data in Excel “spreadmarts” outside of IT, but their potential for damage was limited. Today, as next-generation business intelligence (BI) tools become increasingly powerful and self-service enabled, and as global privacy laws and regulatory requirements increase, businesses without strong data management and governance programs face much greater risk. What is data democratization, and how can your business ensure that self-service data asset development doesn’t trigger chaotic — and costly — consequences? Data management best practices can help you: Keep up with the pace of information needs outside of IT without spawning ungoverned “Shadow IT” practices Manage existing Shadow IT practices, particularly if your organization adds substantially more powerful BI tools to the mix Develop a more open data culture while also valuing privacy, security, and good governance The solution lies in finding the right balance between increasing demands for data governance and the rapidly escalating need for data access. What causes shadow IT — and why it can be dangerous Growing demands for data-driven insights accelerate the demand for knowledge workers to get information when they need it and in a format they can use. As these requests for data insights balloon, IT departments quickly get backlogged. To solve the problem, businesses sometimes turn to self-service data tools, particularly in the BI space. These tools reduce repetitive demands on IT time while enabling users to personalize how they access and view data in their own channels. Tools like Tableau and Alteryx provide rich data visualization, which further speeds time to insight. 
Learn more about business intelligence (BI) options >> While data preparation used to require highly technical skills and toolsets to extract, transform, and load information and generate reporting, data democratization puts significantly more power in the hands of average business users. Business users can now do work that the savviest Excel-wielding shadow IT veteran never dreamed of. Tasks like flattening XML, preparing custom geospatial polygons, blending, and cleansing — and even building predictive models, regressions, neural networks, and Naïve Bayes classifiers — can be done without any traditional IT development knowledge. But data democratization has a dark side. Businesses can get into trouble when data democratization isn’t closely paired with data governance and management. Without a carefully cultivated data culture that understands data governance and management, this dramatic upgrade in shadow IT’s capabilities puts businesses at risk. Data management best practices for risk mitigation As data democratization becomes more of a reality in your organization, data management migrates from your IT and security teams to every business unit. Implementing data management across the business requires clear communication and leadership commitment. Audit your information ecosystem First, take stock of your current state in terms of data intake, preparation, access, and use. Take a fair and honest look at the data management practices and acknowledge where pockets of shadow IT exist. While Excel is obviously ubiquitous, understanding who has licenses for some of the newer tools, like Alteryx, may be a good place to start. When pockets are identified, ask some fundamental questions, like: What information is the business craving? Which tools or solutions have they tried? How are these tools being used? Is this the best tool or multi-tool solution for the job? Is there any overlap or duplication of assets across the business? 
What assets have they developed that could benefit a larger group or even the enterprise? Shift your data management mindset Then, resist the temptation to scold. The historical data management mindset toward those who created these one-off information stores needs to be turned on its head to focus on problems and solutions rather than reprimands. Compared with these more useful one-off data stores, you may find it hard to rationalize all of your current IT-generated assets. The cost to maintain them, particularly if they’re not actually being used, makes them liabilities, not assets. Taking the time to define what the business needs, and then collaborating on the process, information requirements, tooling, and, potentially, infrastructure and architecture solutions that would best meet and scale to fit those requirements is a far healthier approach. Then, your company not only creates a self-service machine that can keep pace with demand, but also goes a long way toward building a healthy data culture. How to build a strong data culture >> Get clear on good governance The term governance gets thrown around a lot, but does your organization have a clear idea of what you mean by it when it comes to your data? It’s not enough for IT to have documented policies and controls. A mature governance program must be seated in the business. Once again, effective processes begin with business requirements. While IT may bear responsibility for implementing the actual controls to provide row-level security or perspectives, the business must provide definitions, quality rules, lineage, and context to inform and support governed access. In this sense, IT becomes the steward responsible for ensuring those business-driven governance requirements are met. 
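To make the row-level security idea concrete, here is a minimal, generic sketch of entitlement-based filtering. The users, regions, and figures are hypothetical, and real BI platforms enforce this inside their server layer rather than in application code — this only illustrates the division of labor described above: the business defines who may see what, and IT implements the enforcement.

```python
# Generic row-level security sketch: business-defined entitlements
# determine which rows of a shared dataset each user may see.
SALES = [
    {"region": "east", "rep": "Ada",   "amount": 1200},
    {"region": "west", "rep": "Grace", "amount": 900},
    {"region": "east", "rep": "Linus", "amount": 400},
]

# Entitlements come from the business; the filter below is IT's control.
ENTITLEMENTS = {"ada": {"east"}, "grace": {"west"}, "auditor": {"east", "west"}}

def visible_rows(user, rows):
    """Return only the rows the user's entitlements allow."""
    allowed = ENTITLEMENTS.get(user, set())  # unknown users see nothing
    return [r for r in rows if r["region"] in allowed]

print([r["rep"] for r in visible_rows("ada", SALES)])       # east reps only
print([r["rep"] for r in visible_rows("stranger", SALES)])  # no entitlements
```

Column-level security ("perspectives") works the same way, except the filter drops fields instead of rows.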
As your organization progresses toward data democratization, keep the following best practices in mind: Establish processes and workflows to bring democratized data and data assets under governance efficiently Co-create governance rules and standards with business units, and be sure they are communicated clearly to all data users Maintain governance requirements, quality rules, and access architectures that make data and data assets suitable and consumable by others within the organization How data governance fits into strategic data management >> Build a bridge between democracy and governance Although bringing the creation and persistence of data assets into the controlled IT fold is critical for good governance, allowing the business to quickly and freely blend, experiment, and discover the most effective fit-for-purpose data sets for their information needs takes the burden off IT of trying to figure out what the business needs. How do mature data organizations bridge the gap between democratized data and good governance? Workflows. Workflows bring democratized asset development and IT-implemented controls together. A strong data workflow, including how requests are processed, prioritized, reviewed, and either approved or rejected, is the critical gatekeeper that prevents democratization from turning into chaos. Your workflow should address: Data submission: Establish the process by which data assets are submitted for enterprise or departmental consideration as governed assets and then persisted according to IT’s self-service standards. Identifying the roles, process (inputs, outputs, gates), and relevant governance structure is fundamental to getting a meaningful workflow in place. Data request backlog: Not every one-off dataset is an asset – the validity of the data produced must be verified by examining the lineage of the data and any transformation logic (e.g., joins, calculations) used in its creation. 
Data scoring: The usefulness of the data must be scored or assessed in some objective way to determine if it should be published and to whom. Data access and security: The workflow process should also address access and security requirements. By streamlining the information demand management process and making it more efficient, your IT team can shift focus to providing higher-value data and information for the business, while potentially driving down cost by retiring the production of lower-value reports or marts. Learn more about how to manage data as an asset >> Manage change well Shadow IT is called that for a reason. Getting those datasets — and those who create them — to willingly step into the light is a culture shift that requires effective change management and clear communication. Encouraging self-service, democratized data asset development by the business is important, but, when left unchecked, it can result in the proliferation of potentially redundant or conflicting data sources, none of which are under IT’s purview. Responsible development and management of all data assets within the organization requires balance, oversight, and commitment to change. Democratizing data holds huge potential for your business when it’s well managed and governed. Not sure where your company stands? Maybe a quick assessment could help. Our team of data experts can help you get clarity with a customized consultation, workshop, or audit designed to fit your needs. Let us know what’s on your mind >> Learn more about data strategy and how to get started >>
Developing a culture of and commitment to viewing data as an asset within your organization not only ensures good governance — and compliance with evolving privacy regulations — it also gives your business the insights needed to thrive in the rapidly changing digital world. Understand the lifecycle of data as an asset To encourage good data management processes, it’s important to understand the lifecycle of a data asset originating outside of IT. In these cases, data from multiple sources is blended and prepped for consumption, which typically includes steps to validate, cleanse, and optimize data based on the consumption need — and because these processes happen outside of IT, be on the lookout for potential security or governance gaps. While individual circumstances vary, from a big picture perspective the data asset development lifecycle generally follows these steps: Intake: Data assets can only be created or derived from other datasets to which the end-user already has access. While traditionally this was more focused on internal datasets, blending with external data, such as market, weather, or social, is now more common. Ask: How are new requests for information captured? Once captured, how are they reviewed and validated? How is the information grouped or consolidated? How is the information prioritized? Design: Once the initial grouping takes place, seeing data as an asset requires thoughtful design that fits in with the structure of other data sets across the organization. Ask: How will new datasets be rationalized against existing sets? How will common dimensions be conformed? How does the consumption architecture affect the homogeneity of data sets being created? Curation: Depending on the source, data might be more or less reliable, but even lower confidence information can be extremely valuable in aggregate, as we’ve seen historically with third-party cookies. 
The more varied the sources contributing to a data asset, the greater the need for curation, cleansing, and scoring. Ask: How will the data be cleansed and groomed based on the consumer’s requirements? Will different “quality” or certification levels of the data be needed? Output: Organizations that view data as an asset prioritize sharing across business units and between tools. Consider implementing standards for data asset creation that take connectivity and interoperability into account. Ask: How will data be delivered? Will it include a semantic layer that can be consumed by visualization tools? Will the data asset feed into a more modern data marketplace where customers (end users) can shop for the data they need? Understanding: As a shared resource, data assets require standardized tagging to ensure maximum utility. Ask: How will metadata (technical and business) be managed and made available for consumers for these sets? How is the business glossary populated and managed? Access: To maintain legal and regulatory compliance and avoid costly mistakes, good governance requires access management. Ask: Who will have access to various delivered assets? Will control require row- or column-level security, and if so, what’s the most efficient and secure way to implement those controls? Explore tools that streamline data asset preparation In many organizations, the data asset lifecycle is no longer a linear journey, where all data proceeds from collection to analysis in an orderly progression of steps. With the advent of the data lake, the overall reference architecture for most companies now includes a “marshaling” or staging sector that allows companies to land vast amounts of data — structured, unstructured, semi-structured, what some have labeled collectively as “multi-structured” or “n-structured” — in a single region for retrieval at a later time. 
Data may later be consumed in its raw form, slightly curated to apply additional structure or transformation, or groomed into highly structured and validated fit-for-purpose, more traditional structures. Podium Data developed a useful metaphor when speaking of these three levels of data asset creation. “Bronze” refers to the raw data ingested with no curation, cleansing, or transformations. “Silver” refers to data that has been groomed in some way to make it analytics-ready. “Gold” refers to data that has been highly curated, schematized, and transformed so that it is suitable for loading into a more traditional data mart or enterprise data warehouse (EDW) on top of a more traditional relational database management system. To streamline the creation of assets at each of those levels, many organizations adopt self-service tools to ensure standard processes while democratizing asset creation. While the vendor landscape is wide in this area, the following three examples represent key functionality: Podium, like Microsoft and others, adopted a “marketplace” paradigm to describe developing data assets for consumption in a common portal where consumers can “shop” for the data they need. Podium provides its “Prepare” functionality to schematize and transform data residing in Hadoop for a marketplace type of consumption. AtScale is another Hadoop-based platform for the preparation of data. It enables the design of semantic models, meaningful to the business, for consumption by tools like Tableau. Unlike traditional OLAP semantic modeling tools, a separate copy of the data is not persisted in an instantiated cube. Rather, AtScale embraces OLAP more as a conceptual metaphor. For example, when Tableau interacts with a model created in AtScale on top of Hadoop, the behind-the-scenes VizQL (Tableau’s proprietary query language) is translated in real time to SQL on Hadoop, making the storage of the data in a separate instance unnecessary. 
Alteryx is also a powerful tool for extracting data from Hadoop, manipulating it, then pushing it back into Hadoop for consumption. Keep security in mind It is worth noting that many self-service tools have a server component to their overall architecture that is used to implement governance controls. Both row-level security (RLS) and column-level security (sometimes referred to as perspectives) can be put in place, and that security can often be implemented in more than one way. Many of these tools can leverage existing group-level permissions and security that exist in your ecosystem today. Work with a consulting services partner or the vendors themselves to understand recommended best practices in configuring the tools you have selected in your environment. Whether you’re evaluating self-service data tools or looking for ways to shift your organization’s culture toward seeing data as an asset, we can help. Fusion’s team of data, technology, and digital experts can help you architect and implement a comprehensive data strategy, or help you get unstuck with a short call, workshop, or the right resources to reframe the questions at hand. Read about key considerations for data democratization >> Learn more about data strategy and how to get started >>
Almost 20 years ago, Capital One recognized the need for one person to oversee their data security, quality, and privacy, and the role of the Chief Data Officer was born. Now reports show that 68% of organizations have a CDO (Harvard Business Review, 2020). And while the role has become more common and has significantly evolved, many data executives are still struggling to get a seat at the table or bring data to the forefront of their organization. In fact, in a recent survey, only 28% of respondents agreed that the role was successful and established. Company leaders agree that there needs to be a single point of accountability to manage the various dimensions of data inside and outside of the enterprise, including the quality and availability of that data. But now we are at a crossroads — what is the best way to align the work that the CDO does with the strategy of the business as a whole? The reality is that CDOs often struggle to find the internal and external support and resources needed to educate others to align with the organization’s goals. Implementing enterprise data governance, data architecture, data asset development, data science, and advanced analytics capabilities — such as machine learning and video analytics — at scale is not an easy task. To be successful, data executives need support, resources, and communities focused on the elevation of data. We are proud to continue to help these communities come to life for the benefit of our colleagues and clients, establishing local resources here in the Midwest with global scale, reach, and impact. Read on as Mark Johnson, our Executive Leader for Data Management and Analytics, provides insight on the current state of data and the CDO, and shares details on multiple opportunities for data leaders of different levels to get more involved in the data community. Q: How has the role of data changed/evolved for organizations? The reality is that information is everything. 
This global pandemic proved that to many organizations. For some, it showed that their digital network was ready, and they were aptly prepared to take on COVID. For others, it has forced them to recognize their own immaturity with data and analytics. On its own, managing data is not exciting — the information just sort of exists. To give data value, you have to put it to use. And so, I think we are going to see the Chief Data Officer and/or Chief Data Analytics Officer really come into their own in the coming years. It’s time for their seat at the table. The C-suite is now asking questions that can only be answered with data, and now they truly understand both the value and consequences of the data game. Q: What do you think are the biggest challenges facing CDOs/data leaders today? I think that the biggest challenge for data executives today is the acquisition of talent that is seasoned and experienced where you need them to be for your organization. Higher education hasn’t necessarily kept up with the data world, and oftentimes it takes additional training to reach the right levels. The reality is that right now the talent is manufactured in the real world. Data executives have to be connected and equipped to mentor, train, and keep the right people. Q: You’ve mentioned that data leaders need to connect with each other. What value can people expect from these data communities? I think there is tremendous value. As we are seeing the power of data evolve in organizations, and the role of data leaders evolve as well, I think coming together to collaborate and share elevates the leader, the organization, and the view of data as a whole. In these communities, it gives people a safe space to talk about how they are doing, what they are doing, what their biggest challenges are, and what solutions are working for them. These communities have truly become both a learning laboratory and an accelerator for data. 
Q: As a big proponent of connecting data leaders, you have been involved in creating different opportunities for people to get together. What groups/events would you recommend, and how can people get involved? I personally have been involved with the MIT Chief Data Officer and Information Quality Symposium (MIT CDOIQ), which is such a great opportunity to start with for connection. It has developed into additional opportunities for data leaders at all levels to get involved and create the kind of community we need to truly elevate the value of data. Organizations like the CDO Magazine, the creation of CDO roundtables across the nation, and the International Society of Chief Data Officers (isCDO) all evolved from connecting data leaders and identifying common challenges. MIT CDOIQ: The International MIT Chief Data Officer and Information Quality Symposium (MIT CDOIQ) is one of the key events for sharing and exchanging cutting-edge ideas and creating a space for discussion between data executives across industries. While resolving data issues at the Department of Defense, the symposium founder, Dr. Wang, recognized the need to bring data people together. Now in its 15th year, MIT CDOIQ is a premier event designed to advance knowledge, accelerate the adoption of the role of the Chief Data Officer, and change how data is leveraged in organizations across industries and geographies. Fusion has been a sponsor of this symposium for seven years now, and we are so excited to see how the event has grown. Designed for the CDO or top data executive in your organization, this is a space to really connect with other top industry leaders. CDO Roundtables Fusion has always been focused on building community and connecting people. 
And when one of our clients, a Fortune 500 retailer, mentioned wanting to talk with other data leaders from similar corporations, we realized that there was a big gap here — there was no space that existed where data leaders could informally come together, without sales pitches and vendor influence, and simply talk. That’s how the CDO roundtables were born — a place that allows data leaders to get to know each other, collaborate, accelerate knowledge growth, and problem solve. We just started two years ago in Cincinnati, but now we’ve expanded to multiple markets including Indianapolis, Columbus, Cleveland, Chicago, and Miami. These groups are designed for your CDO/CDAO and truly create an environment for unfiltered peer-to-peer discussion that helps solve data leadership challenges across industries. If you’re interested in joining one of these roundtables or starting one in your market, email me or message me on LinkedIn. I’m here and ready to get these roundtables started with executives in as many communities as I can. The more communities we have, the more data leaders and organizations we can serve. International Society of Chief Data Officers (isCDO) Launched out of the MIT CDOIQ symposium, the isCDO is a vendor-neutral organization designed to promote data leadership. I am excited to be a founding member of this organization, along with our Vice President of Strategy, David Levine. Our ultimate goal is to create a space that serves as a peer-advisory resource and enables enterprises to truly realize the value of data-driven decision making. With multiple membership options available, isCDO is the perfect opportunity for data leaders looking to connect with their peers and gain a competitive advantage by focusing on high-quality data and analytics. CDO Magazine I am really proud to be a founder of the CDO magazine, as it really is a resource for all business leaders, not just the CDO. 
We designed the magazine to be a resource for C-suite leaders — to educate and inform on the value proposition, strategies, and best practices that optimize long-term business value from investments in enterprise data management and analytics capabilities. Check out the publication here. And if you’re interested in contributing content or being interviewed, let me know at firstname.lastname@example.org. Closing: The role of the CDO is integral to organizations, but it’s still evolving. Now more than ever, it is important that data leaders come together to collaborate and problem-solve. Fusion is excited to be a part of each of these initiatives, and we are committed to being an agent of change in the communities we serve and beyond. By connecting global thought leaders we believe that organizations will realize the value of data to power their digital transformation. If you’re interested in joining any of these data communities or just have questions, feel free to reach out to Mark via email or on LinkedIn.
While every organization’s journey to digital transformation looks different, one thing remains the same — the importance of data. Tackling your data systems and processes is vital to fully transform. However, the reality is that most organizations are overwhelmed with data about their customers. And these troves of information are useless unless companies know their data is accurate and know how to analyze it to make the right business decisions. In today’s world, organizations have been forced to pivot and have realized the value data can bring to drive insight and empower their decision-making. However, many organizations have also recognized their data immaturity. So how do you move forward? The role of data in digital transformation Data can be your organization’s biggest asset, but only if it is used correctly. And things have changed. A lot of organizations have completed the first steps in their digital transformation, but now they are stuck — they aren’t getting the results they expected. Why? They haven’t truly leveraged their data. According to Forrester, “Firms make fewer than 50% of their decisions on quantitative information as opposed to gut feelings, experiences, or opinions.” The same survey also showed that while 85% of those respondents wanted to improve their use of data insights, 91% found it challenging to do so. So, now that you’ve got the data, how can you make it more valuable? Data strategy is key to your digital transformation With so many systems and devices connected, the right information and reporting is critical. But first, you have to make sure you have the right technology in place. Utilizing big data Although you might feel inundated with the amount of data you have coming in, using big data analytics can bring significant value to your digital transformation. Through big data analytics, you can get to a granular level and create an unprecedented customer experience.
With information about what customers buy, when they buy it, how often they buy it, etc., you can meet their future needs. It enables both digitization and automation to improve efficiency and business processes. Optimizing your legacy systems Legacy systems are critical to your everyday business, but can be slow to change. Why fix what’s not necessarily broken? But just because systems are functioning correctly doesn’t mean they’re functioning at the level you need them to — a level that is conducive to achieving your data and digital transformation goals. This doesn’t have to mean an entire overhaul. You’ve likely invested a lot into your legacy systems. One key to a good data strategy is understanding how to leverage your legacy systems to make them a part of (instead of a roadblock to) your digital transformation. With the enormous scale of data so closely tied to applications, coding and deployment can often make this stage of your digital transformation feel overwhelming. Sometimes DevOps tooling and processes are incompatible with these systems. Therefore, they are unable to benefit from Agile techniques, continuous integration, and delivery tooling. But it doesn’t have to feel impossible — you just need the right plan and the right technology. Focusing on your data quality Even with the right plan and technology, you have to have the right data. Bad data can have huge consequences for an organization and can lead to business decisions made on inaccurate analytics. Ultimately, good data needs to meet five criteria: accuracy, relevancy, completeness, timeliness, and consistency. With these criteria in place, you will be in the right position to use your data to achieve your digital transformation goals. Implementing a data strategy with digital transformation in mind So how do you implement your data strategy? You should start by tackling your data engineering and data analytics. The more you can trust your data, the more possibilities you have. 
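The five data quality criteria above lend themselves to automated checks. The sketch below is a minimal, hypothetical example in Python (the field names, freshness window, and reference lookup are invented for illustration, not a production rule set):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical record layout; real checks depend on your schema.
REQUIRED_FIELDS = {"customer_id", "email", "last_updated"}

def check_completeness(record):
    """Completeness: every required field is present and non-empty."""
    return all(record.get(f) for f in REQUIRED_FIELDS)

def check_timeliness(record, max_age_days=30):
    """Timeliness: the record was refreshed within the allowed window."""
    age = datetime.now(timezone.utc) - record["last_updated"]
    return age <= timedelta(days=max_age_days)

def check_consistency(record, reference):
    """Consistency: a key field agrees with a second system of record."""
    return record["email"] == reference.get(record["customer_id"])

def score_record(record, reference):
    """Run all checks and return a name -> pass/fail map."""
    return {
        "completeness": check_completeness(record),
        "timeliness": check_timeliness(record),
        "consistency": check_consistency(record, reference),
    }
```

Accuracy and relevancy are harder to automate, but even these three mechanical checks, run on every load, catch a large share of the bad data that would otherwise feed your analytics.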
By solving your data quality problem, you can achieve trust in your data analytics. And then, the more data you have on your customers, the more effective you can make your customer experience. But, this all requires a comprehensive data strategy that allows your quality data to be compiled and analyzed so you can use it to create actionable insights. The biggest tools to help here — AI and machine learning. The benefits of a data-driven digital transformation The benefits of investing in your data are clear, including increased speed to market, faster incremental returns, extended capabilities, and easier access and integration of data. Discover more about the different ways you can invest in your data and improve and accelerate ROI for your organization. Ultimately, your goal is to elevate how you deliver value to your customers. Digital transformation is the key to understanding your customers better and providing a personalized customer experience for them. Leveraging your data can make all the difference between you and your competitors. And we’re here to help. Learn more about how some of our clients have benefited from investing in their data and digital transformation.
Getting the right data to the right people at the right time is the name of the game in today’s demanding marketplace. Every company has to find a way to harness big data and use it to drive growth. And if your organization isn’t talking big data, you are at a competitive disadvantage. This article covers a top-level view of big data’s evolution and key components. It can help you understand the importance of big data and the technologies that are essential to its discussion. With this foundation, you can proceed to the next step — addressing what to do with your data and how. Just how much data exists? With every passing moment, the pace of data creation continues to compound. In the time it takes you to read these few paragraphs, there will be:
- more than 200 million emails sent
- millions of dollars in e-commerce transacted
- 50 hours of YouTube videos uploaded
- millions of Google searches launched
- tens of millions of photos shared
Every few minutes, this cycle repeats and grows. In 2019, 90% of the world’s digital data had been created in the prior two years alone. By 2025, the global datasphere will grow to 175 zettabytes (up from 45 zettabytes in 2019). And nearly 30% of the world’s data will need real-time processing. Over the last decade, an entire ecosystem of technologies has emerged to meet the business demand for processing an unprecedented amount of consumer data. What is big data? Big data happens when there is more input than can be processed using current data management systems. The arrival of smartphones and tablets was the tipping point that led to big data. With the internet as the catalyst, data creation exploded with the ability to have music, documents, books, movies, conversations, images, text messages, announcements, and alerts readily accessible. Digital channels (websites, applications, social media) exist to entertain, inform, and add convenience to our lives.
But their role goes beyond the consumer audience — accumulating invaluable data to inform business strategies. Digital technology that logs, aggregates, and integrates with open data sources enables organizations to get the most out of their data and methodically improve bottom lines. Big data can be categorized into structured, unstructured, and semi-structured formats. The development of modern data architecture Until recently, businesses relied on basic technologies from select vendors. In the 1980s, Windows and the Mac OS debuted with integrated data management technology, and early versions of relational database engines began to become commercially viable. Then Linux came onto the scene in 1991, releasing a free operating system kernel. This paved the way for big data management. What is big data technology? Big data technologies refer to the software specifically designed to analyze, process, and extract information from complex data sets. There are different programs and systems that can do this. Distributed file systems In the early 2000s, Google proposed the Google File System, a technology for indexing and managing mounting data. A key tenet of the idea was using many low-cost machines to accomplish big tasks more efficiently and inexpensively than the hardware of a central server. Before the Information Age, data was transactional and structured. Today’s data is assorted and needs a file system that can ingest and sort massive influxes of unstructured data. Open-source and commercial software tools automate the necessary actions to enable the new varieties of data, and their attendant metadata, to be readily available for analysis. Hadoop Inspired by the promise of distributing the processing load for increasing volumes of data, Doug Cutting and Mike Cafarella created Hadoop in 2005. The Apache Software Foundation took the value of data to the next level with the release of Hadoop in Dec. 2011.
Today, this open-source software technology is packaged with services and support from new vendors to manage companies’ most valuable asset: data. The Hadoop architecture relies on distributing workloads across numerous low-cost commodity servers. Each of these “pizza boxes” (so called because they are an inch high and less than 20 inches wide and deep) has a CPU, memory, and disk storage. They are simple servers with the ability to process immense amounts of various, unstructured data when running as nodes in a Hadoop cluster. A more powerful machine called the “name node” manages the distribution of incoming data across the nodes. By default, data is written to at least three nodes and might not exist in its entirety as a single file in any one node. Below is a simple diagram that illustrates the Hadoop architecture at work. Open source software The majority of enterprises today use open source software (OSS). From operating systems to utilities to data management software, OSS has become the standard fare for corporate software development groups. Serving as a progressive OSS organization, Apache Software Foundation is a non-profit group of thousands of volunteers who contribute their time and skills to building useful software tools. As the creators, Apache continuously works to enhance Hadoop code — including its distributed file system called Hadoop Distributed File System (HDFS) — as well as the code distribution and execution features known as MapReduce. Within the past few years, Apache released nearly 50 related software systems and components for the Hadoop ecosystem. Several of these systems have counterparts in the commercial software industry. Vendors have packaged Apache’s Hadoop with user interfaces and extensions, while offering enterprise-class support for a service fee. In this segment of the OSS industry, Cloudera, Hortonworks, and Pivotal are leading firms serving big data environments. 
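The block distribution described above, with each file written to at least three nodes, can be illustrated with a toy sketch. This is not Hadoop code; it is a few lines of Python with invented node names, showing only the idea that every block lands on three distinct nodes and no single node holds the whole file:

```python
import itertools

def place_blocks(blocks, nodes, replication=3):
    """Toy 'name node': assign each block to `replication` distinct
    data nodes in round-robin order. This only illustrates the
    distribution idea, not real HDFS placement policy."""
    ring = itertools.cycle(nodes)
    return {b: [next(ring) for _ in range(replication)] for b in blocks}

placement = place_blocks(
    ["blk_0", "blk_1", "blk_2", "blk_3"],
    ["node-a", "node-b", "node-c", "node-d", "node-e"],
)
```

Real HDFS placement also weighs rack topology and node health; round-robin here just keeps the sketch short.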
Many of these software systems are so tightly built against the core Hadoop environment that no commercial vendor has attempted to replicate the functionality. The range of OSS systems, tools, products, and extensions to Hadoop includes capabilities to import, query, secure, schedule, manage, and analyze data from various sources. Storage Corporate NAS and SAN technologies, cloud storage, and on-demand programmatic requests returning JSON, XML, or other structures often serve as secure repositories of ancillary data. The same applies to public datasets — freely available datasets covering, in many cases, economic activity by industry classification, weather, demographics, location data, and thousands more topics. Data of this magnitude demands storage. Distributed file systems greatly reduce storage costs while providing redundancy and high availability. Each node has its own local storage. These drives don’t need to be fast or solid-state drives, commonly called SSDs. They are inexpensive, high-capacity pedestrian drives. Upon ingestion, each file is written to three drives by default. Hadoop’s management tools and the name node monitor each node’s activity and health so that poorly performing nodes can be bypassed or taken out of the distributed file system index for maintenance. The term “data lake” describes the vast storage of different types of data. These vastly different data sources arrive in at least a dozen different file formats. Some are compressed or zipped. Some have associated machine data, as found in photos taken with any phone or digital camera: the date, camera settings, and often the location are available for analysis. For example, a query to the lake for text messages that included an image taken between 9 p.m. and 2 a.m. on Friday or Saturday nights in Orlando on an iPhone would probably show fireworks at Disney World in at least 25% of the images.
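A query like that ultimately reduces to filtering on extracted metadata. As a rough illustration (the records and field names below are invented; a real data lake would surface EXIF fields through a query engine rather than Python lists), the filter might look like:

```python
from datetime import datetime

# Hypothetical metadata records, as a query engine might surface them
# after extracting EXIF fields from ingested images.
photos = [
    {"taken": datetime(2023, 6, 2, 22, 30), "city": "Orlando", "device": "iPhone"},
    {"taken": datetime(2023, 6, 5, 14, 0),  "city": "Orlando", "device": "iPhone"},
    {"taken": datetime(2023, 6, 3, 1, 15),  "city": "Tampa",   "device": "Android"},
]

def late_night_weekend(ts):
    """9 p.m. to 2 a.m. on Friday or Saturday nights."""
    if ts.weekday() in (4, 5) and ts.hour >= 21:   # Fri/Sat, 9 p.m.-midnight
        return True
    if ts.weekday() in (5, 6) and ts.hour < 2:     # Sat/Sun, after midnight
        return True
    return False

matches = [p for p in photos
           if p["city"] == "Orlando"
           and p["device"] == "iPhone"
           and late_night_weekend(p["taken"])]
```

The point is that once the machine data rides along with each file, ad hoc questions like the fireworks example become simple predicates instead of manual review.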
Administration of big data initiatives The enterprise administration of applications — their storage requirements, security granularity, compliance, and dependencies — required Hadoop distributions to mature these capabilities in the course of becoming a managed service to an enterprise (like those from Cloudera and Hortonworks). In the graphic above, you can see a view of Hadoop’s place among other software ecosystems. Note that popular analysis tools are valuable in developing big data solutions:
- Excel and Tableau
- databases such as SQL Server and Oracle
- development platforms such as Java and Informatica Data Quality
Administration through Cisco and HP tools is common. Further development of big data Commercial software companies have begun connecting to Hadoop, offering functionality such as:
- data integration
- quality assessment
- context management
- visualization and analysis
from companies such as IBM, Microsoft, Informatica, SAP, Tableau, Experian, and other standard-bearers. Analytics and data science Analytics is the endgame for developing a big data environment. The rise of big data has given credence to a new resource classification, the data scientist — a person who embodies an analyst, technologist, and statistician all in one. Using several approaches, a data scientist might perform exploratory queries using Spark or Impala, or might use a programming language such as R or Python. As a free language, R is rapidly growing in popularity. It is approachable by anyone who is comfortable with macro languages such as those found in Excel. R and its libraries implement statistical and graphical techniques. Moving to the cloud Cloud computing is very different from server-class hardware and software. It involves cloud storage, multi-tenant shared hosts, and managed virtual servers that are not housed on a company’s premises.
In cloud environments, an organization does not own equipment, nor does it employ the network and security technologists to manage the systems. Cloud computing provides a hosted experience, where services are fully remote and accessed with a browser. The investment to build a 10- or 20-node HDFS cluster in the cloud is relatively small compared to the cost of implementing a large-scale server cluster with conventional technologies. The initial build-out of redundant centers by Amazon, Microsoft, Google, IBM, Rackspace, and others has passed. We now have systems available at prices below the cost of a single technician. Today, cloud computing fees change rapidly, with pricing measured by various usage patterns. Conclusion: The rise of big data is evident Big data is not a fad or soon-to-fade trending hashtag. Since you began reading this article, more than 100 million photos have been created, with a sizeable portion having a first-degree relationship to your industry. And the pace of data creation continues to increase. The distribution of computing processes can help organizations gain a 360-degree view of their customers through big data collection and analysis. And companies that embrace big data technologies and solutions will pull ahead of their competitors. Big data technologies are becoming an industry standard in finance, commerce, insurance, healthcare, and distribution. Embracing big data technologies and solutions is key to optimization and continued growth. Companies that adopt data solutions can continue to improve management and operational processes and create a competitive advantage to withstand an ever-evolving marketplace.
Executive summary The credit card industry is becoming more complex. Advanced loyalty programs, targeted offerings, unclear rate conditions, and many other factors can often make it difficult for banks to identify the right customer. Ultimately, the financial services firms that succeed in this environment will engage the right customers with the right message at the right time. Market leaders will be those who can accurately forecast the revenue and risk for each prospective and existing customer. While the credit card environment has changed, the analytics and modeling techniques have largely remained the same. These models are highly valuable, but they do not offer the flexibility to evaluate the granular and complex customer behaviors inherent in a financial services firm’s data and other public and private data sets. Machine learning and deep learning (collectively, machine learning) change the paradigm for predictive analytics. In lieu of complex, expensive, and difficult-to-maintain traditional models, machine learning relies on statistical and artificial intelligence approaches to infer patterns in data, spanning potentially billions of available patterns. These insights, not discoverable with traditional analytics, may empower the financial industry to make higher-value, lower-risk decisions. In this brief article, we discuss three potential opportunities that Fusion expects should add high value to the financial services industry. Advanced analytics for banking Machine learning uncovers patterns in complex data to drive a predictive outcome. This is a natural fit for the banking industry, as firms are often working with imperfect information to determine the value of incoming customers. How it works: Traditional models vs. machine learning Credit scorecards represent the basis of most credit card issuance decision making.
Whether a firm leverages off-the-shelf models or applies bespoke modeling, Fusion expects the following is representative of a credit scorecard: In the aggregate, these models are highly valuable. But on a per-applicant basis, patterns and details are lost. With machine learning, we can explore detailed and expansive public and private data about segmented applicants for marketing purposes in real time. For example, we can supplement our existing models with data that can be used to segment potential customers, such as:
- Regional FICO trends
- Educational attainment
- Social media sentiment analysis
- Mortgage and equity analysis
- Much, much more
Machine learning can apply artificial neural networks to uncover patterns in your applicants’ history across millions of data points and hundreds of statistical training generations. When detecting these patterns, machine learning models can uncover risk in approved applicants and value in sub-prime applications. For example, by exploring existing customers, machine learning could potentially reveal that applicants with low FICO scores but high educational attainment in a specific city suburb have historically resulted in minimal write-offs. Conversely, a high-FICO applicant may have recently moved into a higher-net-worth neighborhood, requiring high expenditure on a financial institution’s credit lines and resulting in repayment risk. Ultimately, your customer data can tell a far richer story about your customers’ behavior than simple payment history. Machine learning opportunities Financial services firms can gain more insight and capitalize on the benefits of machine learning by applying their marketing dollars toward customers who are more likely to fit within their desired financial portfolio. Lifetime customer value for customers with limited credit data Currently, credit scores are determined based on traditional data methods.
Traditional data typically means data from a credit bureau, a credit application, or a lender’s own files on an existing customer. One in 10 American consumers has no credit history, according to a 2015 study by the Consumer Financial Protection Bureau (Data Point: Credit Invisibles). The research found that about 26 million American adults have no history with national credit reporting agencies, such as Equifax, Experian, and TransUnion. In addition to those so-called credit invisibles, another 19 million have credit reports so limited or out-of-date that they are unscorable. In other words, 45 million American consumers do not have credit scores. Through machine learning models and alternative data (any data that is not directly related to the consumer’s credit behavior), lenders can now directly implement algorithms that assess whether a banking firm should market to a customer segment, assigning customer risk and scores even to credit invisibles (thin-file or no-file customers). Let’s look at a few sources of alternative data and how useful they are for credit decisions:
- Telecom/utility/rental data
- Survey/questionnaire data
- School transcript data
- Transaction data – This is typically data on how customers use their credit or debit cards. It can be used to generate a wide range of predictive characteristics.
- Clickstream data – How a customer moves through your website, where they click, and how long they spend on a page.
- Social network analysis – New technology enables us to map a consumer’s network in two important ways. First, this technology can be used to identify all the files and accounts for a single customer, even if the files have slightly different names or different addresses. This gives you a better understanding of the consumer and their risk. Second, we can identify the individual’s connections with others, such as people in their household.
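To make this concrete, here is a deliberately simplified sketch of how alternative signals might supplement a FICO-only view for a thin-file applicant. The weights and feature names are invented for illustration; in practice they would be learned from historical repayment data, typically by a neural network or gradient-boosted trees rather than a hand-built formula:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Illustrative, hand-picked weights. In a real model these would be
# learned from historical write-off data, not set by hand.
WEIGHTS = {
    "fico_norm": 2.0,           # traditional score, scaled by 850
    "education_years": 0.15,    # alternative signal
    "regional_fico_trend": 0.5, # alternative signal
    "bias": -3.0,
}

def default_risk(applicant):
    """Estimated probability of write-off (lower is better)."""
    z = (WEIGHTS["bias"]
         + WEIGHTS["fico_norm"] * applicant["fico"] / 850.0
         + WEIGHTS["education_years"] * applicant["education_years"]
         + WEIGHTS["regional_fico_trend"] * applicant["regional_fico_trend"])
    return 1.0 - sigmoid(z)
```

Even in this toy form, two applicants with the same 620 FICO come out with very different risk estimates once the alternative signals are included, which is precisely how such models surface value in thin-file segments.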
When evaluating a new credit applicant with little or no credit history, the credit ratings of the applicant’s network provide useful information. Whether a bank wants to more efficiently manage current credit customers or take a closer look at the millions of consumers considered unscorable, alternative data sources can provide a 360° view that delivers far greater value than traditional credit scoring. Alternative data sets can reveal consumer information that increases the predictive accuracy of the credit scores of millions of credit prospects. This allows companies to target consumers who may not appear to be desirable because they have been invisible to lenders before, which can lead to a commanding competitive advantage. ON-DEMAND WEBINAR: Learn how to turn data into insights that drive cross-sell revenue Optimizing marketing dollars to target customers Traditional marketing plans for credit card issuers call for onboarding as many prime customers as possible who meet the risk profile of the bank. However, new customer acquisition is only one piece of the puzzle. To drive maximum profitability, banks can consider not only the volume of customers, but also the overall profitability of each customer segment. Once these high-value customer segments are identified, credit card marketers can tailor specific products to them to deliver high value. Machine learning can assist both in the prediction of total customer value and in the clustering of customers based on patterns and behaviors. Identifying high-risk credit card transactions in real time Payments are the most digitalized part of the financial industry, which makes them particularly vulnerable to digital fraud. The rise of mobile payments and the competition for the best customer experience push banks to reduce the number of verification stages. This leads to lower efficiency of rule-based approaches.
The machine learning approach to fraud detection has received a lot of publicity in recent years, shifting industry interest from rule-based fraud detection systems to machine-learning-based solutions. However, there are also understated and hidden events in user behavior that may not be evident but still signal possible fraud. Machine learning allows for creating algorithms that process large datasets with many variables and helps find these hidden correlations between user behavior and the likelihood of fraudulent actions. Another strength of machine learning systems compared to rule-based ones is faster data processing and less manual work. Machine learning can be used in a few different areas:
- Data credibility assessment – Gap analytics help identify missing values in sequences of transactions. Machine learning algorithms can reconcile paper documents and system data, eliminating the human factor. This ensures data credibility by finding gaps in it and verifying personal details via public sources and transaction history.
- Duplicate transaction identification – Rule-based systems in use today routinely fail to distinguish errors or unusual transactions from real fraud. For example, a customer can accidentally push a submission button twice or simply decide to buy the same goods again. The system should differentiate suspicious duplicates from human error. While duplicate testing can be implemented with conventional methods, machine learning approaches increase accuracy in distinguishing erroneous duplicates from fraud attempts.
- Identification of account theft and unusual transactions – As the rate of commerce grows, it’s very important to have a lightning-fast solution to identify fraud. Merchants want results immediately, in milliseconds. We can leverage machine learning techniques to achieve that goal with the confidence level needed to approve or decline a transaction.
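As a small illustration of the duplicate problem, the first-cut candidate check is easy to express in code. The sketch below (field names invented) flags same-card, same-merchant, same-amount pairs inside a short window, exactly the candidates a model would then score as accidental resubmission versus deliberate repeat purchase:

```python
from datetime import datetime, timedelta

def flag_duplicates(transactions, window_seconds=60):
    """Flag pairs with the same card, merchant, and amount occurring
    within `window_seconds` of each other."""
    flagged = []
    txs = sorted(transactions, key=lambda t: t["ts"])
    for i, a in enumerate(txs):
        for b in txs[i + 1:]:
            if b["ts"] - a["ts"] > timedelta(seconds=window_seconds):
                break  # sorted by time, so later ones are further away
            if (a["card"], a["merchant"], a["amount"]) == \
               (b["card"], b["merchant"], b["amount"]):
                flagged.append((a["id"], b["id"]))
    return flagged

sample = [
    {"id": 1, "card": "4111", "merchant": "m1", "amount": 20.0,
     "ts": datetime(2023, 1, 1, 12, 0, 0)},
    {"id": 2, "card": "4111", "merchant": "m1", "amount": 20.0,
     "ts": datetime(2023, 1, 1, 12, 0, 5)},   # likely a double submit
    {"id": 3, "card": "4111", "merchant": "m1", "amount": 35.0,
     "ts": datetime(2023, 1, 1, 12, 0, 10)},  # different amount: not flagged
]
flagged = flag_duplicates(sample)
```

A rule like this finds candidates but cannot tell a double click from intentional repeat buying; that final judgment is where a learned model, trained on labeled outcomes, outperforms the static rule.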
Machine learning can evaluate vast numbers of transactions in real time. It continuously analyzes and processes new data. Moreover, advanced machine learning models, such as neural networks, autonomously update their models to reflect the latest trends, which is much more effective in detecting fraudulent transactions. Summary Bottom line: machine learning can leverage your data to develop patterns and predictions about your customers and applicants. These machine learning models are typically simpler to develop and deploy and may be more efficacious than traditional financial services modeling. These models also enable a more detailed forecast about your customers, allowing you to reduce risk while targeting more profitable customers through their lifetime with your credit card services. Related resources Case study: Machine learning predicts outcomes in financial services Case study: How Donatos uses machine learning to retain customers 5 tips to keep the wealth in your company Fusion Alliance has extensive experience in the financial services industry and serves as a preferred solutions provider for many prominent financial services institutions, including Fortune 500 firms. If you’d like to discuss your organization, let us know.
Internet users produce an estimated 2.5 quintillion bytes of data each day. Yes, that’s quintillion — as in a one followed by 18 zeroes. That’s a mind-boggling amount of data. Yet, every day, that information is mined, analyzed, and leveraged into usable insights that businesses then use to streamline operations, assess risks, track trends, reach a specific target audience, and so much more. Big data, the term we use to describe this vast amount of information, is a goldmine for industries seeking to increase revenue and improve operations. But without a solid strategy for how to use that data, you could scour the internet until the end of time and still not see any gains. Before you dive into the big datasphere, it’s best to familiarize yourself with what a big data strategy looks like. Then, you can take measured steps to ensure your vision is properly focused and ready to deliver the value you need. What is a big data strategy? A big data strategy is exactly what it sounds like: a roadmap for gathering, analyzing, and using relevant industry data. Regardless of business vertical, an ideal big data strategy will be:
- Targeted. You can’t hit a moving target, let alone one that’s too nebulous to define. Drill down to the details until stakeholders are aligned on the business objectives they want to reach through your big data strategy.
- Actionable. Data can be insightful without necessarily being actionable. If your big data strategy doesn’t serve up information usable by the broader team while paving the way for next steps, it likely won’t be beneficial in the long run.
- Measurable. As with any other business plan, a big data strategy needs to be measurable to deliver lasting success. By measuring your incremental progress, you can refine your strategy along the way to ensure you’re gathering what you need and assessing it in a way that serves your goals.
What’s the best way to approach a big data strategy?
Now that we’ve covered the basics of what a successful big data strategy entails, let’s turn to how your organization might put one into practice. As we’ve worked with clients across industries, we’ve seen the following six steps deliver wins. Your big data strategy will likely require unique details, but this action plan gives you a starting point. 1. Gather a multi-disciplinary team Big data is not solely an IT project; it’s a business initiative. The team should have more representatives from business departments than from the corporate technology group. Members typically include knowledgeable staff or managers from finance, business development, operations, manufacturing, distribution, marketing, and IT. The team members should be familiar with current reports from operational and business intelligence systems. A common thread? Each team member brings ideas about performance indicators, trend analysis, and data elements that would be helpful to their work but which they don’t already access. More importantly, they know why having that information readily available would add value — not only for their business units, but for the organization as a whole. 2. Define the problem and the objectives What problem should be analyzed? What do you hope to achieve through your strategy? Take three problems you’d like to have solved and formulate them into questions. Limit yourself to three, to start. There will always be more questions to answer. Don’t try to tackle them all at once. Write those questions as the subject line on three emails. Send them to all members of the multidisciplinary team. The replies will guide your efforts in narrowing (or expanding) the initial scope of study. Here are a few questions to get the ball rolling: What do you want to know (about your audience, your processes, your revenue streams, etc.)? Which factors are most important for increasing margin on a given service or product? 
How much does social media reflect recent activity in your business? Which outcomes do you want to predict? Developing a 360-degree view of all customers in an enterprise may be too ambitious for an initial project. But finding the characteristics of commercial customers who have bought products from multiple lines of business in five key geographic markets might be a more manageable scope right out of the gate. With this approach, iterations in development provide expansion to all lines of business or to all markets in cadence with a company’s business pace. 3. Identify internal data sources Before getting into the technical weeds, you need to know what data exists internally from a functional viewpoint. Gap analysis will uncover incomplete data, and profiling will expose data quality issues. Your first step is just to identify what usable data you have. If customers for one line of business are housed in an aging CRM, and customers for a newer line of business are found in a modern system, a cross-selling opportunity analysis will point out the need to integrate those data sources. Do you have an inventory of data sources written in business language? In forming a strategy, a team will want to have references, such as vendor contracts, customer lists, prospect lists, vehicle inventory, AR/AP/GL, locations, and other terms that describe the purpose or system from which the data is derived. The list can be expanded for technologists later. Learn how to develop data as an asset >> 4. Find relevant external data sources If you don’t have enough data internally to answer your questions, external data sources can augment what you do have. Public data sites like Data.gov, the U.S. Census Bureau, and the Bureau of Labor Statistics’ Consumer Price Index have a vast amount of information available to anyone who can operate a search function. Data.gov alone has over 100,000 datasets, some containing millions of rows covering years and decades.
Social media is another invaluable source of data. Regardless of industry, Twitter, Facebook, and Pinterest posts may have a greater impact on your operation than you realize. Be sure that a couple of members of the team pursue data from social media sources to include in the initial study.

5. Develop an organizational system

One of the most important elements of a big data strategy is organizing the data you collect. Whether it’s analytics dashboards or full-blown data fabric systems, you’ll need a way to organize data in order to analyze it. Decide how and where you want the data to live, how it can be accessed, and who will have access to it. Remember that the more you democratize data, the more your team grows comfortable with reading and handling this information, and the more insight you can glean. However, this also means you’ll need a strong system of management to ensure the data is secure.

6. Get experienced guidance

Engaging an experienced team that has led others through data strategy and implementation can help you jump-start your strategy. An external resource skilled in big data management can provide your company with a smooth progression through the many tasks at hand. Your guide should have extensive knowledge of business data elements, or BDEs, which are key to creating understandable and cross-company analytical outputs, including reports, charts, graphs, indicators, and other visualizations. Seek guidance especially if your organization doesn’t have a data glossary, network administration, or knowledge of new technologies, as implementing these can be highly technical and time-consuming.

Planning your big data strategy

Planning a big data strategy will require you to rethink the way you manage, operate, and analyze your business. But with the right guidance and tools you can develop an effective strategy that positions your company for growth and success. Need a guide on the path to creating your big data strategy? We’re here to help.
Reach out to an expert to learn more about how you can leverage big data for your business. Discover our strategic data management services >>
The future looks rosy for companies that take advantage of what strategic data management can do. But the specter of needing a team of people handling on-premises hardware, and the cost implications of doing so, continues to make organizations hesitant to move forward with a new data strategy. Here are a handful of factors to consider when weighing the costs versus benefits of implementing a big data strategy in your organization.

1. Compare the dollars and cents

In 2012, I conducted a study that compared the cost of managing data with traditional data warehousing assets, such as Oracle, to the cost of managing that same data with an open-source software framework, such as Hadoop. At the end of the day, even including a 60% discount off list price for the Oracle hardware and software licenses, the cost to manage 1 terabyte in a 16-terabyte configuration with traditional assets was $26,000 per terabyte, compared to $400 per terabyte with an open-source framework.

2. Analyze the total cost of ownership

The reason there wasn’t a mass exodus from Oracle to Hadoop in 2012 was that you have to consider the total cost of ownership. You have to ask, “Does my organization have the skills to manage this new technology environment? Is my existing Business Objects universe investment compatible with the back end?” In 2012, the answer was no. Today, you can connect your existing Business Objects universe investment to Hadoop on the back end. Then you can take all that data out of Oracle, expose it through Hive tables where it can be accessed, and enable the environment to perform even faster than it does in Oracle, for pennies on the dollar. Pennies! Why wouldn’t you do that?

3. Evaluate the competitive advantage

It goes something like this: “Well, if my competitor is running their data warehouse for $4 million a year on a legacy technology stack, and I can ‘lift and shift’ my data warehouse to a technology stack that I can run for $40,000 a year, who’s going to gain a competitive advantage?”

4. Assess the value of a 360-degree view of your customer

In the TV series “How to Get Away with Murder,” detectives perform a forensic analysis of a suspect’s cell phone data that was backed up to his computer; the rest of the data is provided by the telecom provider. Because of the GPS service on the suspect’s phone, the detectives were able to identify his entire route from one state to another, how much time he spent in motion, how much time he spent when he stopped, when he started again, and how many minutes his phone was in a particular location. They were able to create a geospatial plot of his path, all using the data stream from his mobile phone as he was driving with his phone on his person.

This brings us to another important point about data today: we’re living in a world of mashups. There’s an opportunity to subscribe to a Twitter feed and mash it up with an email address linkage in a way that reveals my behavior and thought processes. Everything that lives in the Twitter space or in my Facebook posts can be analyzed. Mashing up these many sources of data into a mega-analytic platform has become easy to accomplish, but not if you don’t have a strategy for how you’re going to manage the data.

Sam Walton’s objective with his fledgling Walmart stores was to always know what the customer wanted to buy and always have it on the shelves when he or she walked into the store.
Back in the 1980s, Walmart used Teradata technology to build a database to collect all of the point-of-sale data, which was then used to calculate how many units they would need to ship to each store so they wouldn’t have to carry a surplus of inventory. The rest is history. The database actually became much more valuable to Walmart than the inventory carrying costs problem it solved. And now Walmart is a half-trillion-dollar-a-year global company.

5. Gauge the payoff of higher-end analytics

Amazon is another huge data success story. As you know, they started as an online bookseller and didn’t make much money selling books online. But what they were able to do is get consumers to come to their portal, interact, and leave data behind. They were very successful in leveraging that data, and from it they have grown into a company with over $100 billion in sales. And now, of course, Amazon sells everything.

Amazon is using the highest-end analytics, called predictive analytics. In fact, they recently filed for a patent on an analytic model that can predict what you’re going to buy before you buy it. Predictive analytics tells them there’s a pretty good chance that you’re going to purchase a product in the next 24-48 hours. They’re so confident in the accuracy of their algorithm that they would ship you that product before you even buy it. Let’s say something from Amazon shows up on your doorstep that you didn’t order, but it’s something that you wanted. Then you’ll pay for it. This isn’t yet a production feature of amazon.com, but keep your eye on the bouncing ball!

The future of big data strategies and strategic data management

The future belongs to companies whose data game is completely integrated into the foundation of how they do business in the marketplace.
And because companies like Amazon know so much, their revenue is so diverse, and their ability to manage data is so significant, they are now even in the data hosting and data enrichment services business. They are selling their data and hosting apps on an infrastructure that exists because of their desire, and ability, to manage data effectively. If you look at where venture capital partners are investing their money today, you’ll see that it’s in companies busy creating that layer of integration between the front end and the back end, because they have determined that the benefits of a big data strategy greatly outweigh the costs.
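The dollars-and-cents comparison from point 1 above can be sketched in a few lines. This is a hypothetical back-of-the-envelope calculation: the $65,000 list price is an assumption, back-calculated from the cited figures ($26,000 per terabyte after a 60% discount implies a $65,000 list price).

```python
# Hypothetical back-of-the-envelope version of the 2012 comparison above.
# LIST_PRICE_PER_TB is an assumption back-calculated from the cited figures.

CAPACITY_TB = 16            # the 16-terabyte configuration in the study
LIST_PRICE_PER_TB = 65_000  # assumed list price (back-calculated)
DISCOUNT_PCT = 60           # 60% off list, per the study
OPEN_SOURCE_PER_TB = 400    # cited cost with an open-source framework

# Discounted traditional cost per terabyte: $26,000
traditional_per_tb = LIST_PRICE_PER_TB * (100 - DISCOUNT_PCT) // 100

savings_ratio = traditional_per_tb / OPEN_SOURCE_PER_TB
total_savings = (traditional_per_tb - OPEN_SOURCE_PER_TB) * CAPACITY_TB

print(f"Traditional: ${traditional_per_tb:,}/TB")
print(f"Open source: ${OPEN_SOURCE_PER_TB:,}/TB")
print(f"Roughly {savings_ratio:.0f}x cheaper; ${total_savings:,} saved on {CAPACITY_TB} TB")
```

Even at this crude level, the gap is the point: a 65x difference per terabyte is what makes the total-cost-of-ownership questions in point 2 worth asking.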
Recently, our team was on a call with a client who was trying to consolidate dozens of transactional systems into a single model to support a more effective reporting paradigm. The envisioned solution focused on self-service, visual analytics, while also supporting more traditional reporting. This client’s challenges were similar to what many other businesses face today. They wanted:

- Quicker time to insight
- Empowered end users
- Lessened dependency on IT
- Reduced reconciliation of reports, etc.

Sound familiar? The client wasn’t questioning whether or not there was value in the project ahead. Their questions were focused on the best approach. Do we pursue a big bang approach or something more agile in nature? Upon further discussion and reflection, the objectives of the program seemed to be a perfect case for agile. Let’s talk about why.

Iterative selling of value

While the client knew the value of the project, we discussed how, in reality, data projects can die on the vine when the value isn’t apparent to the business funding the initiative or to the IT executives who need to demonstrate their operational ROI. As such, the ability to demonstrate value early and often becomes critical to building and keeping the momentum necessary to drive projects and programs across the finish line. Project sponsors need to constantly sell the value up to their management and across to the ultimate customer. Iterative wins become selling points that allow them to do so.

Know your team’s delivery capability

To truly understand what can be delivered (and by when) means accurately assessing how much work is in front of you and how quickly your team can deliver with quality. This example project was as new as the client’s team. For them, the most logical approach was to start doing the work to learn more about the work itself as well as the team.
After a few iterations, the answers to the following questions become clearer:

- Parametric estimating – How do I estimate different complexities of types of work or data sources? How do I define the “buckets” of work and associate an estimate with each? What values do I assign to each of these buckets?
- Velocity – How quickly can my team deliver with each iteration? How much work can they reliably design, build, and test?
- Throttling – What factors can I adjust to predictably affect velocity without compromising quality or adversely affecting communication?
- Continuous improvement – Fail fast, learn fast, adapt. Do I understand what factors are impeding progress that I can influence? What are we learning about how we are accomplishing the work so we can improve going forward? How do we get better at estimating?
- Team optimization – Do I have the right players on the team? Are they in the right roles? How does the team need to evolve as the work evolves?

Foster trust – ensure adoption

Anyone who relies on data, whether they are in business or IT, has their go-to sources. Getting an individual to embrace a new source for all of their information and reporting needs requires that the new source be intuitive to use, performant, and above all, trustworthy. As with any new solution, there will be skepticism within the user community and, whether conscious or not, an unspoken desire to find fault in the new solution, thereby justifying staying with the status quo. Data quality and reliability can be the biggest factors that adversely impact adoption of a new data solution.

By taking an agile, iterative development approach, you expose the new solution to a small group initially, work through any issues, then incrementally build and expose the solution to larger and larger groups. With each iteration, you build trust and buy-in to steadily drive adoption.
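As a sketch of how those estimating and velocity questions fit together, here is a minimal, hypothetical model. The bucket values, backlog counts, and iteration history below are made-up numbers for illustration, not figures from the client project.

```python
# A minimal, hypothetical sketch of parametric estimating and velocity
# tracking. All numbers here are illustrative, not from any real project.

from math import ceil
from statistics import mean

# Parametric estimating: assign a point value to each "bucket" of work.
BUCKET_POINTS = {"simple_source": 3, "moderate_source": 8, "complex_source": 20}

# Remaining backlog, expressed as a count of items in each bucket.
backlog = {"simple_source": 10, "moderate_source": 6, "complex_source": 2}

# Velocity: points the team actually delivered in completed iterations.
completed_iterations = [18, 22, 20]

total_points = sum(BUCKET_POINTS[b] * n for b, n in backlog.items())
velocity = mean(completed_iterations)
iterations_remaining = ceil(total_points / velocity)

print(f"Backlog: {total_points} points")
print(f"Observed velocity: {velocity} points per iteration")
print(f"Forecast: about {iterations_remaining} more iterations")
```

As iterations accumulate, the same arithmetic supports the throttling and continuous-improvement questions: re-estimate the buckets, recompute velocity, and watch whether the forecast is converging.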
Generate excitement

By following an iterative, expanding rollout, you can foster genuine excitement about the new solution. As use expands, adoption becomes more a result of contagious enthusiasm than of a forced, orchestrated, planned activity. Tableau’s mantra for many years has been “land and expand” — don’t try to deploy a solution all at one time. Once people see a solution and get excited about it, word will spread, and adoption will be organic.

Eliminate the unnecessary

While there are many legitimate use cases for staging all “raw” data in a data lake, concentrating on the right data is the appropriate focus for self-service BI. The right data is important for ensuring the performance of the semantic model, and it’s important for presenting the business user with a model that remains uncluttered with unnecessary data. Agile’s focus on a prioritized set of user stories will, by definition, de-prioritize and ultimately eliminate the need to incorporate low-priority or unnecessary data. The result is the elimination of wasted migration time and effort, a reduced need for the creation and maintenance of various model perspectives, and ultimately quicker time to insight and value.

Adjust to changing requirements and priorities

Finally, it’s important to understand that data projects and programs focused on enabling enhanced or completely changed reporting paradigms take time to implement, often months. Over that time, priorities will likely change. An agile approach allows you to reprioritize with each iteration, giving you the opportunity to “adjust fire” and ensure you’re still working on the most important needs of the end users.

Ready to roll out a successful self-service business intelligence program and not sure where to start? If you’re ready to take the next step, we’re here to help.
A massive storm is brewing in the banking, financial services, and insurance industries, and when it strikes, it will be devastating to the unprepared. That storm is the unprecedented transfer of wealth, $3.9 trillion worth, that will pass from the hands of older generations to younger ones in the next eight years or so. The rains have already started to trickle, but when they come in full force, if your organization hasn’t already connected with younger generations, you’ll see millions of dollars in wealth walk right out your door. If your bank doesn’t have a plan in place for customer retention, it’s not too late to take action.

Consider that millennials (born circa 1981-1997, also called Gen Y) are now the largest generation, accounting for over 25% of the population. They are followed by Gen Z (born circa 2000-present), those born with digital devices in their hands, who comprise more than 20% of the population. The potential purchasing power of these generations combined is something that can make or break banks, wealth management firms, and insurance companies. Yet most businesses in these industries still don’t have a game plan to connect with an entire population. Will your company be different?

The problem is complex, but no matter where you stand, a solution is within reach if you create a strategy informed by data and insights, with a clear road map to success. Here are five tips for building successful customer retention strategies for your bank, so you can emerge strong on the other side of the impending wealth transfer.

1. Understand the challenges of banking for millennials

Recognize that this is a whole new audience you’re dealing with. The old ways won’t work in the new economy of connected consumerism. A 360-degree view of your current customers will help you gain insights into what the older generation wants, but with an eye towards the future consumers of your brand.
They’re not like baby boomers (born 1946-1964) or Generation Xers (born 1965-1979). This newer generation sees things differently than their parents and grandparents did. Get to know this younger audience on their terms and understand why they have different belief and value systems, and why they view traditional institutions skeptically. Examine the world through their eyes. They’ve seen that industry giants that their elders once perceived as invincible (e.g., Lehman Brothers) are now gone, and that others, like Wells Fargo, AIG, and Countrywide, had to be rescued by the government from the brink of bankruptcy, with taxpayers footing the bill. They’ve seen the effects of parents being laid off after years of loyal service to a corporation. They know families who lost their homes when the housing bubble burst. Can you blame them for being leery of traditional institutions?

An Adroit Digital survey examining millennials’ brand loyalty reported that 77% said they use different criteria to evaluate brands than their parents do. Are you aware of what criteria they are using to evaluate your brand? If not, you need to arm yourself with answers. Research shows that younger generations frequently turn to friends, independent online research, reviews, and social media for decision making. For example, an astounding 93% of millennials read reviews before making a purchase, and 89% “believe friends’ comments more than company claims,” according to an IBM Institute for Business Value survey. Your future hinges on understanding these behaviors.

A report by Gallup on the insurance sector revealed, “Millennials are more than twice as likely (27% vs. 11% respectively) as all other generations to purchase their [insurance] policies online rather than through an agent.” Online purchasing is far from the mainstream among insurance consumers overall: “74% originally purchased with an agent vs.
14% online – but if this trend among millennials continues to grow, it could substantially change the way insurance companies interact with customers in the coming years,” the report stated. Likewise, “Banks are losing touch with an entire dominant generation,” according to Joe Kessler, president of Cassandra Global. The Cassandra Report cited that 58% of young adults said they would rather borrow money from friends or family than from a traditional institution. Two-thirds of the respondents said it is “hard to know where to learn about what financial services they might need.” In other words, when it comes to banking, millennials don’t know whom to trust.

Begin the process of getting to know this younger clientele by conducting research that will help you gain insights into what they stand for, how and where they interact, and what their expectations are of your industry, your company, and your brand. By evaluating that data, you will be able to set the process for communicating with, and building different ways to engage with, these new young consumers. Your interactions and communications must be seamless and easy and reflect that you can talk on their terms. You’ll need to look at this emerging demographic with a “digital lens,” because this is how millennials engage with brands. What are those channels, and what are their preferences? What other services can you make available in a seamless, frictionless, and customized way? If you don’t take the time to get to know your audience, you won’t be able to lay the foundation for a successful strategy to engage them.

2. Make young customer retention your bank’s primary mission

Younger generations, millennials especially, are driven by a different set of values. They want a work/life balance. They like to donate money. They don’t want a lot of stuff. They like to travel. They want to experience life. They question long-standing rules that don’t make sense to them.
So, develop your business strategy around a purpose or a mission – one that they will connect with. Build upon the information you learned about your younger customers in tip #1, then map this customer’s journey with behavioral analytics. Evaluate the digital channels and content that your younger clients find compelling. Now you can create a strategy and roadmap to engage these customers.

3. Build your customer experience for different audiences

A strong customer experience (CX), the kind that creates loyalty, is personalized, timely, relevant, appropriate, and built on trust. The more customizable the user experience, the better. According to Janrain, 74% of online users are frustrated with brands that provide content that doesn’t reflect their personal interests. You know users want to be recognized on their terms, but you have a problem: how do you build a single CX that addresses vastly different generations with different behaviors and interests? Is there a way to reconcile their differences via a single CX? The answer is no. For the time being, you need to develop both. If someone tells you differently, beware.

Think about it. In wealth management, banking, and insurance, the older generation still holds the money and keeps the lights on for your business. The newer generation will get that money within a decade, but if you go full-throttle and build a single, mobile-first CX, you’re going to alienate the people holding the purse strings. In the next few pivotal years, your bank’s customer retention will depend heavily on how well you address each audience on its own terms.

How to cater to older generations

Older folks prefer offline channels, like walking into a branch, agency, or brokerage firm. They like to do business face to face or via phone conversations with tellers, bankers, agents, and wealth advisors. Online, they like having a “control panel” style experience on a desktop, such as you might find with financial trading platforms.
This is how you build trust and timely, relevant, personalized experiences. Online, build a web portal that appeals to the interests, needs, and communication preferences of the older generation. The younger generation will use the web portal now and then, but that is not going to be the experience they associate with your brand – because you’ll give them their own.

How to cater to younger generations

Give the younger generation mobile apps and SMS communications. With over 87% of millennials saying they are never without their phone, this is where you should reach them. They have no interest in setting foot in a building that feels like an institution or talking to some random agent, broker, or salesperson when they can do everything quickly and effortlessly on a mobile device. Take the information you learned in tips #1 and #2 and build strong loyalty, providing timely, relevant, personalized, and appropriate experiences in the digital dimension. As you build a CX specifically tailored to banking for millennials, you’ll find you can gain loyalty on their terms because you’ll be able to interact in a more agile, nimble, and personalized way.

The older generation will probably use the mobile app for simple tasks like checking information and balances, but they’re going to associate their comfort with your brand with the CX they use most – the desktop. Two CXs could be the right solution for today’s transitioning market, but keep in mind that there are additional channels through which you can build loyalty with these younger audiences across the digital landscape. For example, you can share educational, informative content through social media channels.

4. Knowledge transfer to the younger generation

Everyone in wealth management, insurance, and financial services already has a foot in the door with the younger generation. That connection is the strong relationship between existing older customers and their offspring. Leverage it.
First, understand that the older generation wants to take care of the younger ones by leaving money to them, but they are worried that the next generation doesn’t have the knowledge or discipline to hold onto and grow that money. There are many stories of young people, like athletes or celebrities, getting rich quickly, getting bad advice about money, and then squandering it all. What if their children make the same mistakes? Help address that fear and protect those kids by arming your older customers with educational tools on how to prevent this from happening.

For this CX, you’ll need to develop portals and educational content, manage and market that content, and make it come to life in an updated website (geared to the older generation) that features whitepapers, articles, or videos, such as “Talking to Your Children About Money 101” and the like. Educate this audience on how to talk about the benefits of insurance or long-term investment strategies, and provide them with incentives to set up meetings that include them, their offspring, and you. The younger generation isn’t interested in talking to an institution, but they will listen to the advice of the parent or grandparent giving them this money. Let the parents and grandparents have meaningful conversations that hold much more weight than your business sending a bulk email to junior that says, “Invest in an IRA.”

Now, when members of the younger generation, the recipients of transferred wealth, decide to check out your company on the advice of their parents or grandparents, they will access your relevant app that speaks their language and addresses things of interest to them. They’ll soon figure out that you’re not some stodgy institution and will be much more open to a discussion when their parents suggest a conversation with your company’s brokers, advisors, or agents.
This is how the knowledge transfer will occur organically, growing your bank’s customer retention along the way as you build a relationship of loyalty and trust. You will not only give the benefactors peace of mind that their offspring will be good stewards of their fortune when the time comes, but you’ll also keep the money in-house because you took the time to connect with and earn the trust of the young beneficiaries.

5. Make use of emerging technologies in banking to satisfy the ever-changing digital landscape

At this point, you know you could benefit from two CXs. The web platform focuses on the needs and concerns of the older generation that holds the wealth today. The mobile platform addresses the younger person who will inherit the wealth, providing guidance, teaching the basics of how to invest or buy insurance, and offering quizzes, games, personalized spreadsheets, automated tools, and more.

The challenge is that when the older generations pass on, the desktop experience will be moot. You don’t want to have to rebuild all the technology infrastructure that you worked so hard to establish. The answer? Don’t build applications or tools – build platforms for the future that can be adapted as the younger generation takes over and as mobile-first interactions become predominant five years from now.

Don’t overlook the fact that more cost-effective emerging technologies in banking, such as infrastructure in the cloud, will be a necessary ingredient for success. Banks and insurance companies are reluctant to move to the cloud, but if you understand that most applications are going to be in the cloud five years from now, you understand the critical nature of developing these capabilities today. The cloud enables rapid changes to meet market and customer demands. It is flexible and nimble. You pay for what you use, can pay for service or infrastructure, and simultaneously increase security and reliability.
To those unfamiliar with the cloud, security can be a scary proposition. However, with major cloud providers like Microsoft and Amazon employing an army of experts to ensure security and regulatory compliance, the cloud is safer from a security standpoint than most on-premises data storage. While 85% of companies using the cloud report they are confident their providers can deliver a secure environment, 90% of IT managers reported they are not confident in their own companies’ ability to detect security problems internally. If you’re building a flexible technology platform with the right digital CXs, infrastructure that looks to the future, and cloud capabilities, then your organization will be positioned for success when the wealth transfer hits in the next decade.

Final thoughts on customer retention strategies for banks

There are more than 75 million millennials out there spending $600 billion every year, and that number is only going to increase. They are graduating from college with massive amounts of debt, face a precarious job market, and are typically naïve about financial matters and insurance. The companies that aggressively work to offer practical tools and advice on banking for millennials are the ones who will outperform their competition in the future. It’s not too late, but you cannot wait to take action. If a business does not begin building the bridge between current wealth owners and soon-to-be wealth recipients until after the wealth-transfer process has begun, it will experience a devastating economic blow and get left behind by those who have embraced this shift.

The ball is in your court

Everyone has predicted that the landscape of the wealth management, banking, and insurance markets will change dramatically due to digital disruption and younger generations, but with the right strategy in place, your organization can emerge as a leader. Look at this as an opportunity to differentiate.
A digital strategy will be the key to your success. Don’t look at digital as an application. Digital is the way all future generations will engage and interact. Leverage it today and do it well to tie the present with the future. Your formula for success is to create an actionable plan that is both informed and driven by insights and data on what people buy, how, what they expect, how they feel, and whether the experience is personalized, relevant, and timely. You need to understand your audience and use those insights to feed a strategy that ties into the mission and purpose of your customers. Bring your strategy to life in a digital channel that sits on top of flexible technology. Measure your customers’ experiences and level of engagement with your brand, and then make adjustments, continually working off of research and data. Follow this formula, and eight years from now, you’ll be the organization that is reaping the rewards because you understood how to keep millions of dollars from leaving your company. Need help improving your customer retention in banking? Let us know.