Let's make the pie BIGGER! How to increase customer satisfaction and revenues with your data

Last week I gave a presentation on how Big Data and data from the IoT can help businesses improve customer satisfaction levels throughout each part of the customer life cycle. Today’s customer, no matter the industry, expects to have a positive and personal experience with companies even before there is a formal relationship. After they become a customer or register on a website, they expect a higher level of personalization and engagement and to be rewarded for their loyalty. Throughout the presentation (which can be viewed on our YouTube channel), there were three themes that were repeated multiple times no matter the life cycle stage or industry example.

The first theme was to collect as much information about the customer and their preferences as quickly as possible. This is especially important before you have an official relationship with a customer. Well, you might ask, how can you collect information about a customer if you don't know their buying preferences or even who they are because they haven't registered with your company? Each smartphone and computer is equipped with a unique machine ID. When you implement a Big Data or IoT solution, it's critical to record that machine ID along with either the potential customer's pathways around a physical store or their browsing history on a website. It's also critical to record as much information as possible. Don't stop with just which pages a customer visited on your site. Record how much time a customer spent on a specific page (a longer visit probably means they're reading the description and are interested in that product or service), how far they scrolled down the page (you'll know what they read and where they stopped), and whether they scrolled through the product pictures. All of this information can then be used to make personalized recommendations if the customer returns to the website in the future from the same device. And if a customer does eventually register from that same device, the information you collected about them in the past can be added to their new account.
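The idea above can be sketched in a few lines. This is a minimal illustration, not a production tracker: the event fields and method names are my own, and a real system would persist events rather than hold them in memory.

```python
from collections import defaultdict

# Minimal sketch (hypothetical field names): accumulate browsing events
# keyed by device ID, then attach the anonymous history to a new account
# when the visitor registers from the same device.
class VisitorTracker:
    def __init__(self):
        self.events_by_device = defaultdict(list)

    def record_page_view(self, device_id, page, seconds_on_page, scroll_depth_pct):
        """Store one page view: which page, dwell time, and scroll depth."""
        self.events_by_device[device_id].append({
            "page": page,
            "seconds_on_page": seconds_on_page,
            "scroll_depth_pct": scroll_depth_pct,
        })

    def link_to_account(self, device_id, account):
        """On registration, move the device's anonymous history onto
        the new account so past behavior informs future recommendations."""
        account.setdefault("history", []).extend(
            self.events_by_device.pop(device_id, []))
        return account
```

For example, a long dwell time plus a deep scroll on a product page (`record_page_view("abc123", "/vacuums", 95, 80)`) is a stronger interest signal than a quick bounce, and that distinction survives into the linked account.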

The next thing you'll want to ensure in a solution is that all interactions happen in real time, while the customer is still in the store or on the website. This serves two purposes. First, it makes the customer feel like they are getting personalized service: they become aware of products they might not know about, and they see that the company cares about their satisfaction. Second, it gives the company a chance to up-sell or cross-sell the customer, thus, in the words of one of my favorite marketing professors, making the pie bigger for everyone. If the interaction happens after the customer leaves the store, the chances of that customer returning for the additional item or to take advantage of the promotion are much lower. Real-time responses are important not only while a customer is browsing a store or website but also when the customer has a negative experience. This is especially important given the popularity of social media. It's too easy for a dissatisfied customer to go to Twitter or Facebook and post a negative message about your company. You want to be immediately aware of the disservice and correct it before the customer has a chance to post about their negative experience. A real-time message or correction from the company can prevent this, whereas if the company waits even an hour or two, the dissatisfied customer can post online, and the damage to the company's standing with that customer and all of the customer's followers is already done.

Finally, there needs to be a balance between personalization and respect for an individual's privacy. Over the past month, two of my neighbors have told me how their Facebook accounts knew a little too much about them. In one case, a neighbor mentioned to his wife, while he had his phone out, that they should look into getting a Dyson. There were no internet searches or visits to Dyson.com, just a mention to his wife. The next day there was a Dyson ad in his Facebook feed. He was immediately "creeped out" by the fact that his conversation had somehow been processed by his phone and then reflected in a Facebook ad, and he immediately deleted Facebook from his phone. As a company, you need to remember that people want personalized yet not intrusive recommendations. It's a tough balance at times, but it's critical to the success of your Big Data or IoT solution.

By keeping these three takeaways in mind your solution will help nurture and maintain customer relationships.

Entrigna provides consulting services to help evaluate your system, and its Real-Time Expert System platform is the only solution platform on the market that incorporates all of the major big-data-related algorithms in one seamless solution. We specialize in healthcare and retail solutions, but our technology lets clients, no matter the industry, start small and then add on or change their solution as their business needs grow and change. For more information on Entrigna's consulting services or the RTES platform, visit our website at Entrigna.com or e-mail us at info@entrigna.com.

Yes, I meant to say prescriptive

I arrived a few minutes early to a presentation on IoT security a few weeks ago and introduced myself to the man sitting next to me. He asked what the company I worked for did, and I responded, "We're a real-time prescriptive analytics company." He looked at me and asked, "Prescriptive?" I get this quite often; most people think I'm mispronouncing "predictive." Don't get me wrong, we can do predictive analytics, but prescriptive is our specialty and the way of the future! When people correct me, I have to explain that no, I really meant to say prescriptive. This, of course, is followed more often than not by a blank stare as I explain the differences between predictive and prescriptive. To most people the difference in those few letters doesn't mean much. In reality, however, there is a huge difference!

Well, what's the big deal in saying your software is predictive instead of prescriptive? Predictive analytics does just that: it predicts when something is going to go wrong. For example, say I've just created a smart refrigerator. It can tell you that you're going to run out of oranges on Tuesday, that the milk expires Monday, and that a part is going to fail in the next 72 hours. However, that's just it: it predicts when these events are going to happen. It doesn't solve anything. A refrigerator that incorporated prescriptive analytics would not just predict these events but solve them, hence the prescribing. My new refrigerator would re-order the oranges and milk on Instacart and have them delivered, and best of all it would fix the broken part or correct whatever was causing it to malfunction. All in all, a prescriptive solution prescribes remedies to a problem that is occurring or will occur; it doesn't just predict when things are going to happen.
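The refrigerator story boils down to one extra layer of logic. A toy sketch of the distinction, with invented item names, usage rates, and a `place_order` hook standing in for a real grocery-delivery API:

```python
# Illustrative sketch of predictive vs. prescriptive. The inventory data,
# thresholds, and `place_order` callback are hypothetical.

def predict_stockouts(inventory, daily_usage, horizon_days=3):
    """Predictive step: flag items expected to run out within the horizon."""
    return [item for item, qty in inventory.items()
            if qty - daily_usage.get(item, 0) * horizon_days <= 0]

def prescribe_reorders(inventory, daily_usage, place_order):
    """Prescriptive step: don't just predict the problem -- act on it
    by submitting a re-order for each item about to run out."""
    orders = []
    for item in predict_stockouts(inventory, daily_usage):
        place_order(item)   # e.g. submit a delivery order
        orders.append(item)
    return orders
```

The predictive function only reports the coming stockout; the prescriptive wrapper is the part that actually fixes it, which is exactly the difference those few letters carry.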

So, at the presentation, I was getting ready to launch into my speech, but I didn’t have a chance. As soon as I said “Yes, prescriptive,” the man smiled and said, “That’s what my group does too! Whenever I say it, people think I’m mispronouncing predictive.”  Maybe the prescriptive future will be here sooner than I thought!

Hidden revenue streams for smart cities

Today, many cities are toying with the idea of becoming a "smart city": a city that actively does everything from monitoring traffic patterns to predicting when a street light will go out to analyzing the digital information it is collecting. While these are nice-to-know items that can make life easier for inhabitants, lower emissions, and help reduce costs by improving efficiencies, these incremental savings are often not enough to justify the large upfront cost of outfitting items throughout a city with sensors. Not only are the upfront costs of sensors and their installation high, but there is also typically resistance to change from city leaders who are nervous about changing current processes and taking on the risk of implementing a "high tech" project. As a result, smart city managers and project owners need to justify the high expense with a measurable ROI and ensure that the city will also be able to generate ongoing revenue from these improvements rather than just decrease operating expenses.

One way to generate revenue is similar to what Kansas City (MO) has done. Kansas City installed kiosks throughout the city with maps and local information for restaurants, attractions, events, and shopping. The kiosks have the potential to generate several streams of income while collecting important information. Initially, installation of the kiosks can be paid for or subsidized by a semi-permanent advertiser that displays an ad on the outside of the kiosk, so there is little or no cost to the city to install them. As for ongoing revenue, the city can sell advertising space on the screen to different advertisers, who can run ads or offer coupons to users. In addition, users can purchase tickets to attractions, events, or public transportation from these kiosks, and a small fee can be charged to the company selling the ticket. Finally, the information collected from the kiosks, such as which attractions, restaurants, and events are being searched for, or how many people or cars are passing the kiosk, can be sold to businesses in the area. These different revenue streams should not only pay for the upkeep of the machines but also generate extra income for the city.

Another way to generate income involves using the information that people voluntarily give to governmental agencies online. Many people would rather complete forms online than go to a government building, pay for parking, wait in line, and, many times, realize after they've done all of this that they've left an important piece of information at home. More and more cities are allowing inhabitants to handle tasks such as renewing a license plate or city sticker online, and people would probably even be willing to pay a small convenience fee to complete these services online. However, even without this convenience fee, if governments can turn the information they're collecting online into usable data insights, and not just a big dump of data, local municipalities could generate streams of revenue selling those insights. Also, if they could determine what type of person is using their websites, they could display targeted ads, which would generate another source of revenue.

Smart city initiatives can be extremely beneficial to inhabitants, local businesses, and the environment; however, they can be expensive to implement. When a city is planning a "smart city" project, it should think outside the box for revenue-generating opportunities rather than just counting how much money will be saved through increased efficiencies. In a smart city, the possibilities are endless for saving money, improving locals' and tourists' time in the city, minimizing environmental impact, and generating new streams of revenue.

Oh the Places You'll Go.....with IoT

Last week we attended the IoT NA conference in Rosemont, IL. As I stood at the booth, I was amazed not only at the number of regional companies attending but also at the different industries represented. I spoke to people from traditional industries like automotive and tech manufacturing, but I also spoke with attendees from a children's museum, a drone company, and several small cities across Illinois. Conversations with these people from "non-traditional" sectors really got me thinking about how IoT can be utilized in pretty much every sector, and how the first movers from non-traditional industries will not only give themselves an unbelievable competitive edge but also provide an incredible experience for their customers.

When the attendees from the museum came by, I could tell that they were on a scouting mission. They definitely recognized the value of using data from IoT but were unsure of where to start. It makes sense: looking through the agenda of this conference, there were fantastic sessions on sensors, the power of analytics, and even monetizing IoT data. However, most sessions were aimed at manufacturing, because of course manufacturing, with its preventative maintenance and process improvements, is the current leader in IoT projects. While it may have seemed like a one-sided conference, many of the principles discussed can be applied to non-traditional industries. For example, many museums already have apps. Museums could use geolocation on the user's phone to track how people walk through the museum. This information, just like data collected from forklifts driven around a distribution center, can be used to see who is traveling where, where people are stopping, and how many people are visiting exhibits during certain time periods. This information can be invaluable to a museum (or a store, or an airport, and the list goes on). It can show designers whether current pathways are intuitive, identify which parts of an exhibit are the most and least popular, and help create better pathways for guests. If this information is utilized, exhibits can be tailored to what patrons are really interested in, and patrons can navigate through them easily; the result is a happier guest who is more likely to return. Additionally, museums could combine information from a customer's past navigation and purchase history with their current location in the museum or even the time that they're in the museum.
In real time, the app could make recommendations to the patron for special events occurring that day (for example, a Lego building session for a family that has previously purchased Legos or walked through a Lego exhibit) or current sales in the gift store when they are getting ready to exit the museum.

At the end of the day, non-traditional industries will not have out-of-the-box solutions targeted to their specific needs. However, with research on what other industries are doing and a little creative thinking, non-traditional industries can build very powerful and differentiating solutions. To take a page from Dr. Seuss, these non-traditional industries need to think about all of the places they can go... with IoT.

For more information on how to get started with an Internet of Things project, please visit our website or e-mail us at info@entrigna.com.

IoT in Transportation and Logistics

Recently, the ideas of big data and artificial intelligence have been making their way to the forefront of the news, especially in the form of driverless cars. However, as frightening as this seems to many people, big data and the Internet of Things can play a very large and beneficial role in improving transportation and logistics without getting too sci-fi for the average American.

So how can big data and the IoT help this industry? Well, the obvious answer involves preventative maintenance. Don't get me wrong, preventative maintenance is a great thing. By predicting when a part is going to break, you can proactively replace it or service the equipment, whether it's an 18-wheeler or a piece of factory machinery. By making preventative maintenance part of your business strategy, you'll ultimately have more uptime, which leads to higher revenue. However, this is so 2016. Preventative maintenance is just the tip of the iceberg when it comes to using the IoT in transportation and logistics.

One other way companies can use the IoT to streamline their processes involves prioritizing emergency calls. Many companies rely on dispatchers to get their service technicians to a call as soon as possible. Typically, when a service company gets a call to respond to an emergency, the dispatcher sees who is free in the area and then picks a person to respond. There is usually no automated process to identify not only which technicians are in the area but also how conditions such as weather, traffic, or construction could affect how long it would take each technician to respond. In theory, a technician a mile away from a call could take much longer to respond if a road is closed than another technician who is ten miles away but coming from a direction without traffic. The IoT can help solve this challenge: by combining current weather, traffic, and construction conditions with the location of each technician, companies can automatically identify which technician or responder would make it to the call in the shortest amount of time.
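The dispatch logic above can be sketched as "rank by estimated travel time, not raw distance." The speeds and delay figures here are invented; a real dispatcher system would pull them from live traffic, weather, and construction feeds.

```python
# Sketch of ETA-based dispatch: pick the technician with the shortest
# estimated travel time once route delays are factored in.
# All numbers below are illustrative assumptions.

def estimated_minutes(distance_miles, avg_speed_mph, delays_minutes):
    """Travel time = driving time plus known delays (detours, jams) on the route."""
    return distance_miles / avg_speed_mph * 60 + sum(delays_minutes)

def pick_technician(candidates):
    """candidates: list of (name, distance_miles, avg_speed_mph, [delay_min, ...]).
    Returns the name of the technician with the lowest ETA."""
    return min(candidates,
               key=lambda c: estimated_minutes(c[1], c[2], c[3]))[0]
```

With a 40-minute road-closure detour, the technician one mile away comes out at 42 minutes, while the one ten miles away on a clear route arrives in 20, which is exactly the counterintuitive case the paragraph describes.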

The power of the IoT can also be seen in the use of RFIDs. RFIDs are tiny tags that many parts and products are labeled with. Many companies use RFID technology to reactively manage inventory: when a company "does inventory," employees are told how many items are on a shelf instead of manually scanning each item on the floor and in the stock room (that was always my least favorite part of my college job at Banana Republic). However, the real power of these little chips is the ability to proactively manage your inventory. Using RFIDs, employees can be alerted when inventory levels are running low, and items can either be restocked from the stock room or re-ordered. For more cool ways RFIDs can be used in a logistics setting, check out our video on how the IoT is Revolutionizing Manufacturing.
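The proactive half of that idea is just a threshold check against live tag counts. A minimal sketch, with SKUs and reorder points invented for illustration:

```python
# Proactive RFID inventory sketch: compare live on-shelf tag counts
# against reorder points and flag anything that needs restocking.
# SKU names and thresholds are hypothetical.

def restock_alerts(tag_counts, reorder_points):
    """Return the SKUs whose on-shelf RFID count has fallen to or below
    the reorder point, so staff can restock or re-order before a shelf
    goes empty."""
    return sorted(sku for sku, count in tag_counts.items()
                  if count <= reorder_points.get(sku, 0))
```

Run on each fresh shelf scan, this turns the same tag data used for "doing inventory" into an alert that fires before a customer ever finds an empty shelf.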

For more information on how to get started with an Internet of Things project, please visit our website or e-mail us at info@entrigna.com.



Big Data in Agriculture

Over the next few months, I'll be writing blogs on some non-traditional industries that use big data. I'm looking forward to sharing updates and information on how big data can be used in all industries, not just the ones we typically associate with technology.

Farming is something most of us take for granted. We go to the grocery store and pick out our food without giving much thought to where our food came from or what went into growing it. We think of small quaint farms where farmers plant seeds, ride small tractors and then harvest their crops. However, many farms in the United States rely heavily on technology and are turning to big data to help them become more efficient, cost-effective, and less environmentally impactful.

Today's tractors not only use sensors to collect information that helps with preventative maintenance; they also carry multiple computer screens and sensors that collect everything from nitrogen and pH levels in the soil to how far apart the seeds are planted. Farmers tend to use this information while planting; however, many farmers do not use the information they've collected after the fact.

Farmers are also using “precision farming” to help make farming more efficient. This technique can mean many things, but ultimately it means using information about the soil and crops in a specific area to maximize the output of the crop and minimize the production cost for a crop. Farmers can use this information for everything from identifying the best places to plant certain crops to how many plants per acre they can plant.

In the future, we can expect to see more farmers adopting precision farming and other big data techniques. The big data market for agriculture is expected to grow from a $2.4B industry in 2014 to a $5.04B industry in 2020 (Research and Markets, Global Precision Agriculture Market 2014-2020), and with the population projected to grow to 9 billion people by 2050, farmers will need to increase outputs significantly to keep up with demand. We're already seeing some very interesting ways that precision farming and big data solutions can be implemented at larger facilities. For example, Gallo Winery recently implemented a system that takes satellite imagery of its vineyards and determines which plants are getting too little or too much water. The images are processed and analyzed, and then the sprinkler connected to an individual plant is automatically adjusted to give it either more or less water. Water consumption at Gallo Winery has been reduced by 25% since the system was implemented, the health and production of the plants have increased, and the costs associated with workers manually watering individual plants have decreased.
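The per-plant loop described above reduces to mapping an imagery-derived moisture reading to a sprinkler adjustment. This is a hedged sketch: the 0-to-1 moisture index, the thresholds, and the plant IDs are my assumptions, not Gallo's actual values.

```python
# Sketch of imagery-driven irrigation: each plant's moisture index
# (derived from satellite image analysis) maps to a sprinkler action.
# Index scale and thresholds are illustrative assumptions.

def sprinkler_adjustment(moisture_index, low=0.3, high=0.7):
    """moisture_index in [0, 1]; return 'more', 'less', or 'hold'."""
    if moisture_index < low:
        return "more"   # plant is too dry: increase water
    if moisture_index > high:
        return "less"   # plant is over-watered: decrease water
    return "hold"       # within the healthy band: leave as is

def plan_watering(readings):
    """readings: {plant_id: moisture_index} -> per-plant sprinkler actions."""
    return {plant: sprinkler_adjustment(m) for plant, m in readings.items()}
```

The interesting engineering is upstream, in turning pixels into a reliable moisture index; once that exists, the actuation step really is this simple, which is why the savings show up so quickly.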

The real power of big data will come when farmers start sharing their data with companies. In the past, farmers have been very hesitant to share the data they collect with corporations. Many farmers view the information from their fields as proprietary and worry that it will be shared with commodity traders or other farmers. They also worry that seed and equipment companies will use the information to sell them higher-priced goods. However, seed and equipment companies need information from individual farms in order to improve their software and products so farmers can keep achieving the best results possible. In the next few years, I believe seed and equipment companies will start focusing on how to earn the trust of farmers and proactively show them how sharing this information will lead to substantial ROIs. Also, as time progresses, farmers will become more comfortable with big data technologies, and the payoff of higher yields and ultimately lower costs will persuade them to share their data.

Trends in Big Data and the IoT in 2016

As we enter the new year, it’s always an exciting time to reflect upon the previous year and ask “What new things will happen next year?” Over the past year, it’s been really cool to see how executives at companies are realizing the value of using big data instead of just collecting it.  Because of this trend, 2016 should bring about disruptive changes in the big data and internet of things markets.

Some of the top trends I see happening in 2016 are:

Customer satisfaction levels will be influenced by an automatic personalized experience

As consumers become more tech savvy and more millennials have discretionary income, more consumers will continue to adopt and use mobile apps such as Target’s Cartwheel or PriceGrabber while they’re shopping. These consumers are looking for a personalized experience that will give them some benefit, whether it’s a lower price or targeted advertising or coupons based on past behaviors, when shopping. Consumers have many options to choose from when shopping both online and in-person and will ultimately pick the store that gives them the most value and the best shopping experience.

Additionally, with the increase of internet shopping and the multitude of stores available to consumers, companies will start relying more on what an individual is clicking on and posting online about products and their shopping experience. In the past, companies have had challenges making sense of this information in a timely manner and then reacting to it. However, companies are starting to discover solutions that can help them not only react in real time to a customer's shopping experience but also personalize that experience based on past behaviors or trends. These proactive actions should lead to a higher level of customer satisfaction.

Using ROI in big data

Executives are pushing for the adoption of big data solutions; however, many want to see a measurable ROI and meaningful use cases before they make a large investment in a solution. In 2016, solution providers will start partnering with their users to determine the ROI of using a solution. Many times these measurements are straightforward, such as calculating how much money is saved when using data sensors to predict when parts will wear out. However, the ROI of other solutions that combine structured and unstructured data will be more challenging to determine.

Data in the Internet of Things will start to be used instead of just collected

Sensors on many devices help companies predict when parts need to be serviced and can also flag anomalies in the overall system. However, many companies have yet to realize the full potential of this data. In 2016, more companies that collect this type of information will no longer just store it but will start to use it to help prevent downtime and achieve better customer service. Also, with the increased adoption of personal healthcare devices, such as Fitbits and smart watches, more consumers are going to start tracking their own health. Companies that provide solutions that monitor and make recommendations on a consumer's heart rate, blood pressure, or fitness activity will grow.

The need for simplified Big Data

Currently, many of the traditional big data solutions that make real-time decisions require users to be very tech-savvy and require substantial coding. However, in 2016, we will probably see more companies purchasing tools that can be easily used by non-technical users. This is because there is currently a shortage of data scientists and the average salary of an entry level data scientist is quite high compared to that of an entry level analyst. Many companies just can’t afford to have data scientists on staff.  Also, customer facing groups want to be able to see results in real-time and not wait for the IT or data science group to get them the information they need. Solutions will still need to be set up by data scientists and software engineers, however, once the solution is set up, non-technical groups such as marketing and customer service will be the ones accessing the data and writing simple queries to find the information that they need in real-time.

2016 will definitely be an exciting time for big data! The Entrigna team is looking forward to working with companies in the next year to discover how we can help them define and achieve their big data goals! For more information on Entrigna, please e-mail info@entrigna.com.

Value Proposition of Business Decisions - A Systemic Perspective

In today's competitive environment, critical and timely business decisions significantly impact business outcomes such as improving customer relationships, increasing revenues, optimizing costs and resources, improving performance, maximizing operational efficiencies, and even saving lives. The ability to make business decisions intuitively and pertinently depends heavily on the availability and accessibility of business information and data. Every business event, such as a customer purchasing a product, yields business data. Such data, resulting from business applications, processes, transactions, operations, business partnerships, competition, and so on, inherently contains valuable knowledge and business insights about customers, products, policies, systems, operations, and competitors that help in making business decisions. Typical steps in deriving decisions involve collecting the required data, analyzing it by applying intelligence-mining techniques and business rules, extracting interesting insights and new intelligence, understanding the context and applicability of that information, and finally arriving at decisions in terms of what business actions can be taken.

The value proposition of a business decision is measured in terms of its effectiveness in generating expected benefits while accomplishing one or more business goals and outcomes. Many factors affect the effectiveness, or value, of a business decision. One of the key factors is the decision-action latency, defined as the total time taken, after the business event(s) occurred, to collect the required data, analyze the data, extract new insights and intelligence, understand the applicability of that new information, and finally arrive at actionable decisions. According to Dr. Richard Hackathorn, an eminent BI analyst and creator of Time-Value curves, the value of the data required to make an actionable business decision degrades as time elapses after the pertinent business events have occurred. This is shown in the following Time-Value curve:


The decision-action latency is, in turn, the sum of three components: 1) data latency, the time taken to collect and store the data; 2) analysis latency, the time taken to analyze the data and extract new insights and intelligence; and 3) decision latency, the time taken to understand the context and applicability of those new insights and to arrive at decisions in terms of what business actions can be taken.
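A minimal sketch of this arithmetic follows. The additive latency model comes from the text; the exponential decay shape for the Time-Value curve is my illustrative assumption, since Hackathorn's curves are qualitative, and the half-life parameter is invented.

```python
# The three component latencies simply add up to the total
# decision-action latency; value decay over that latency is modeled
# here as an (assumed) exponential with a configurable half-life.

def decision_action_latency(data_latency, analysis_latency, decision_latency):
    """Total time from business event to actionable decision."""
    return data_latency + analysis_latency + decision_latency

def remaining_value(latency_hours, half_life_hours):
    """Fraction of a decision's value left after `latency_hours`,
    assuming the value halves every `half_life_hours`."""
    return 0.5 ** (latency_hours / half_life_hours)
```

The practical point the curve makes: shaving hours off any one of the three components (not just data capture) moves the decision up the curve, which is the argument developed in the sections below.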

It should be noted that business decisions can be strategic or tactical in nature. In the case of strategic decisions, the value or effectiveness is realized even though the underlying data can be quite old, accumulated over long periods of time. Essentially, the slope of the Time-Value curve is small, with a very gradual decrease in value over time. Typically, strategic decisions are based on mining large data sets comprising historical observations collected from many business events over a period of time. A retail store deciding when to run beer sales is an example of a strategic decision: after inferring from store sales data that men who purchase diapers over the weekend also tend to buy beer at the same time, the store can capitalize on this information by putting the beer near the diapers and running beer sales over the weekends.

In the case of tactical decisions, the value or effectiveness is very short-lived, because the underlying data is highly volatile and inherently contains time-sensitive intelligence reflecting momentary business performance. Essentially, the slope of the Time-Value curve is very high, with the curve being extremely steep. Typically, a tactical decision pertains to a single business event or transaction and hence is based on data collected from a single event, compared to or correlated with data from the most recent related business events. Credit card fraud detection is an example of a tactical decision: after inferring that a credit card being used to purchase an item in Chicago was used thirty minutes earlier somewhere across the globe, the credit card company can make an immediate tactical decision to mark the transaction as fraud and place a hold on the card.

So how do companies ensure that the value proposition of business decisions is retained and realized, whether those decisions are strategic or tactical?

Traditional Data Warehouses:

As IT evolved over the years, companies automated their operations to collect data for analysis and reporting. In the early days, each business application would capture such data in its own 'reporting' database. As more operational automation was implemented, these 'islands of information' became siloed and proliferated. Soon companies realized the analytical value of collectively mining and correlating the data from all of these siloed islands. However, collecting and correlating data from all of these siloed systems was a challenge, due to incompatibilities between systems and the lack of easy ways for them to interact and interoperate. The need for an infrastructure to exchange, store, and analyze the data, so that a unified view of insights and intelligence across the enterprise could be created, was recognized, and the traditional data warehouse evolved to fill it. The traditional data warehouse organized information from different sources so that data could be effectively analyzed to generate interesting and meaningful reports. Such reports provided key insights for making business decisions, which would then lead to course-corrective actions. To smooth the decision-making process, a broad range of tools was developed over the years: extract, transform, and load (ETL) utilities for moving data from the various data sources to a common data warehouse; data-mining and analytical engines to perform complex analysis; and reporting tools and digital dashboards to provide management and business decision makers with easy-to-comprehend analysis results.

In spite of these attempts to automate decision making, the tasks of analyzing the data to extract new insights and intelligence, understanding the context and applicability of those insights, and deciding what course-corrective business actions to take remained largely manual. As shown in the following diagram, the Time-Value curve for traditional data warehouses had long time-latencies. Traditional data warehouses were therefore predominantly leveraged for strategic business decisions in support of strategic goals such as reducing costs, increasing sales, improving profits, maximizing operational performance and fine-tuning operational efficiencies, by mining and analyzing massive amounts of data collected across the enterprise over a long period of time. Traditional data warehousing has little tactical value, however, since its data is generally quite stale and can be weeks or months old. There were attempts to incorporate new technologies to minimize time-latencies, but these succeeded only in reducing data-latency by further automating data capture; both the analysis task and the decision task remained mostly manual.


Active or Real-Time Data Warehouses:

The need for a solution that satisfies both the strategic and the tactical goals of an enterprise led to the emergence of Active Data Warehouses, or Real-Time Data Warehouses, sometimes also referred to as Real-Time or Right-Time Business Intelligence (RTBI) systems. Active data warehouses not only support the strategic functions of data warehousing, deriving intelligence and knowledge from past enterprise activity, but also provide real-time tactical support to drive enterprise actions that react immediately to events as they occur. This new breed of data warehouse is designed to reduce all three latencies as much as possible by revamping the utility tools. The traditional ETL process involved downtime of the data warehouse while loads were performed, and is therefore characterized as an offline ETL facility. Active data warehouses instead needed an online ETL facility that preserved historical strategic data while also providing current tactical data. The online ETL's job is to create and maintain a synchronized copy of source data in the active data warehouse, constantly keeping it up to date. Active data warehouses also needed improved data-mining and analytical engines, with the ability to incorporate business rules and the flexibility to run analytical models that consume and adapt to recent data blended with historical data. In effect, active data warehouses markedly reduced overall decision-action-latency and thereby tremendously increased the value proposition of business decisions compared to traditional data warehouses. In addition, they offered the flexibility to make tactical decisions as and when an enterprise needs them. The following diagram shows the Time-Value curve for Active/Real-Time Data Warehouses.


Real-Time-Intelligence-Based Decision Systems:

Even though active data warehouses reduced overall decision-action-latency and thereby increased the value proposition of business decisions, they were still predominantly used in the traditional sense, with a strategic intent, albeit leveraging the most recent data. They were never considered pure providers of 'Real-Time-Intelligence-Based' decision services. Such a system would churn through varying business operational and transactional data on a real-time basis, sense transitory business insights, predict business foresights, and use that reasoning to make real-time decisions that then effect immediate actions through business transactions and operations. It would agglomerate capabilities such as Machine Learning, Data Mining, Rules Processing, Complex Event Processing, Predictive Analytics, Operations Research-type optimizations, Artificial Intelligence and other intelligence-generating algorithmic techniques, and would provide the flexibility to mix and match those capabilities for more complex decision orchestrations. This breadth of decision frameworks is necessary because different business objectives require different analytical approaches. For example, a rules engine works well when recognizing a customer for a milestone; event processing is well suited to identifying potential customer disservice scenarios; and optimization techniques work well when deciding which promotions to place in front of the customer.

A Real-Time-Intelligence-Based decision system would process live data from business events as they occur, combine the event data with other valuable data or other events' data, extract intelligence from it and derive a decision about what action to take. Sometimes knowledge of the event alone is sufficient to derive an insight and act. More often, additional data is needed to correlate and improve intelligence. Another key feature of such a system is the ability to instantaneously learn, adapt and adjust its decision models and business rules as soon as new data is fed back from business events. This spontaneous processing of business event data, combined with instantaneous adaptation of decision models based on fed-back data, effectively eliminates data-latency, analysis-latency and any latency otherwise incurred in re-engineering the decision models from the ground up. The maximum tactical, transient value of business event data is thus fully preserved and exploited while an immediate business action is effected based on a real-time business decision. The value proposition of such a system is depicted by a similar Time-Value curve, where the latencies are in the micro- to millisecond range and any perceived loss in business value is almost nil.
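As an illustration of a decision model adjusting itself as feedback arrives, here is a minimal online-learning sketch: a plain stochastic gradient update on a linear model, where the features, outcomes and learning rate are invented for the example and stand in for whatever adaptive model such a system would actually run:

```python
def sgd_update(weights: list[float], x: list[float], y: float, lr: float = 0.1) -> list[float]:
    """One online-learning step: nudge a linear model toward the fed-back outcome y."""
    pred = sum(w * xi for w, xi in zip(weights, x))
    err = pred - y
    return [w - lr * err * xi for w, xi in zip(weights, x)]

w = [0.0, 0.0]
# Each business event feeds back (features, outcome); the model adapts immediately,
# with no batch retraining step. True relation here is y = 1*x1 + 2*x2.
for x, y in [([1.0, 2.0], 5.0), ([2.0, 1.0], 4.0), ([1.0, 1.0], 3.0)] * 200:
    w = sgd_update(w, x, y)
print([round(v, 2) for v in w])  # -> [1.0, 2.0]
```

The point of the sketch is the control flow, not the model: every feedback event updates the weights in place, so there is no analysis-latency window during which the model is stale.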


Entrigna’s proprietary product RTES falls under the ‘Real-Time-Intelligence-Based’ decision system category. For more information, refer to the blog titled ‘From Real-Time Insights To Actionable Decisions - The Road Entrigna Paves’.

Hope you found this blog informative.

Data & Computation challenges in Machine learning/Predictive Analytics – An Architect’s POV

While building complex machine learning and/or predictive analytic models, huge amounts of historical/sampled data are consumed to train and tune the models. Typically, the data comprises past observations with qualitative and/or quantitative characteristics associated with each observation. Each such observation is commonly referred to as an ‘example instance’ or ‘example case’, and the observation characteristics are commonly referred to as ‘features’ or ‘attributes’. For example, an observation about a certain type of customer may contain ‘height’, ‘weight’, ‘age’ and ‘gender’; these form the attributes of the observation.

A set of past observations is assumed to be sampled at random from a ‘data universe’. Some attributes are numeric and others are discrete. ‘Height’, ‘weight’ and ‘age’ are numeric; ‘gender’ is discrete, with ‘male’ and ‘female’ as values. Another example of a discrete attribute is customer ‘status’, with values such as ‘general’, ‘silver’, ‘gold’ and ‘platinum’. Discrete attributes can take only particular values. There may potentially be an infinite number of those values, but each is distinct, with no grey area in between. Some numeric attributes can also be discrete, such as currency coin denominations: 1 cent, 5 cents (nickel), 10 cents (dime) and 25 cents (quarter) in US currency. Non-discrete numeric attributes are considered ‘continuous’: they are not restricted to well-defined distinct values but can occupy any value over a continuous range, and between any two values there may be an infinite number of others. Attributes such as ‘height’, ‘weight’ and ‘age’ are continuous numeric attributes.

Before any algorithmic models are developed, the entire set of observations is transformed into an appropriate mathematical representation. This includes translating qualitative discrete attributes into quantitative form. For example, a discrete attribute such as customer status can be quantified by representing ‘general’ as 0, ‘silver’ as 1, ‘gold’ as 2 and ‘platinum’ as 3. Numeric attribute values are typically normalized. Normalization adjusts values measured on different scales to a notionally common scale; the most common form is unity-based normalization, where attribute values are scaled to fall into the range between 0 and 1. Continuous numeric attributes are discretized. Discretization refers to converting or partitioning continuous attributes into discrete or nominal values. For example, values of a continuous attribute such as ‘age’ can be partitioned into equal interval ranges, or bins (ages 1-10, 11-20, 21-30, 31-40 and so forth), a process typically referred to as ‘binning’. A finite number of well-defined bins is considered, and based on the bin an attribute value falls into, a distinct bin value is assigned: if ‘age’ falls in the 1-10 bin a value of 1 may be assigned, if it falls in the 11-20 bin a value of 2, and so forth.
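These three transformations can be sketched directly. The status codes, the normalization range and the age bins below mirror the examples in the text; the sample values are invented:

```python
def encode_status(status: str) -> int:
    """Map a discrete attribute onto ordinal codes, as in the text's example."""
    return {"general": 0, "silver": 1, "gold": 2, "platinum": 3}[status]

def unity_normalize(values: list[float]) -> list[float]:
    """Unity-based normalization: scale values down into the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def bin_age(age: int, width: int = 10) -> int:
    """Equal-width binning: ages 1-10 -> bin 1, 11-20 -> bin 2, and so forth."""
    return (age - 1) // width + 1

print(encode_status("gold"))             # 2
print(unity_normalize([150, 175, 200]))  # [0.0, 0.5, 1.0]
print(bin_age(34))                       # 4  (the 31-40 bin)
```

Real pipelines usually keep the fitted scale (the min/max here) so the same transformation can be reapplied to new data at prediction time.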

Other critical issues typically encountered with attribute values are missing values, ‘inliers’ and ‘outliers’. Incomplete data is an unavoidable problem with most real-world data sources: (i) a value is missing because it was forgotten or lost; (ii) a certain feature is not applicable for a given observation instance; (iii) for a given observation, the designer of the training set does not care about the value of a certain attribute (a so-called don’t-care value). Observations with missing attribute values are less commonly discarded; more often, techniques are applied to fill in the closest possible value for the missing attribute. Such techniques are commonly referred to as missing-data imputation methods. Similarly, both inlier and outlier values need to be resolved. An ‘inlier’ is an erroneous data value that lies in the interior of a statistical distribution; an ‘outlier’ is an erroneous data value that lies in its exterior. There are statistical techniques to detect and remove such deviators, the simplest being removal using quartiles.
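A minimal sketch of mean imputation and the quartile rule mentioned above, using only the standard library (the sample values and the conventional 1.5x IQR multiplier are assumptions for the illustration):

```python
import statistics

def impute_mean(values: list) -> list:
    """Fill missing values (None) with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [v if v is not None else fill for v in values]

def remove_outliers_iqr(values: list, k: float = 1.5) -> list:
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR], the quartile rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lo <= v <= hi]

ages = [25, 31, None, 28, 30]
print(impute_mean(ages))                              # [25, 31, 28.5, 28, 30]
print(remove_outliers_iqr([22, 25, 27, 29, 31, 33, 240]))  # [22, 25, 27, 29, 31, 33]
```

Mean imputation is only the simplest choice; median, mode or model-based imputation are common alternatives, and quartile-based removal catches only outliers, while inlier detection requires comparing values against related attributes.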

Data preprocessing also includes attribute value transformation, feature extraction and selection, dimensionality reduction and so on, depending on the applicability of such techniques. Overall, data preprocessing can have a significant impact on the generalization performance of a machine learning algorithm. Once observations are converted into appropriate quantitative form, each observation is considered ‘vectorized’: each observation with quantified attribute values describes a specific point in a multi-dimensional space, and each such point represents a ‘position vector’ with its attributes as coordinate components in each dimension. In almost all cases of building machine learning models, matrix/vector algebra is used extensively to carry out the algebraic operations that deduce the parametric values of the model. As such, the entire set of observations, now converted into position vectors, is represented by a matrix of vectors: either a column matrix where each observation vector forms a column, or a row matrix where each observation vector forms a row, depending on how the algorithm consumes such matrices.
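The vectorization step can be illustrated end to end. The attribute set matches the customer example used earlier, while the encodings and sample values are assumptions for the sketch:

```python
# Two raw observations with the attributes from the earlier example.
observations = [
    {"height": 170, "weight": 65, "age": 34, "gender": "female", "status": "gold"},
    {"height": 182, "weight": 80, "age": 45, "gender": "male", "status": "silver"},
]

# Illustrative encodings for the discrete attributes.
GENDER = {"male": 0, "female": 1}
STATUS = {"general": 0, "silver": 1, "gold": 2, "platinum": 3}

def vectorize(obs: dict) -> list:
    """Quantify one observation into a fixed-order position vector."""
    return [obs["height"], obs["weight"], obs["age"],
            GENDER[obs["gender"]], STATUS[obs["status"]]]

# Row matrix: one observation vector per row, one attribute per column.
X = [vectorize(o) for o in observations]
print(X)  # [[170, 65, 34, 1, 2], [182, 80, 45, 0, 1]]
```

In practice the rows would also be normalized as described above, and the matrix handed to the training algorithm in whichever orientation (row or column) it expects.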

The following are key infrastructure/technology challenges that are encountered right away while dealing with such matrix forms:

- We are talking about not hundreds or thousands of observations but typically hundreds of thousands or millions. Such a matrix, or observations represented in some other format, may not fit into a single process's memory. Besides, there could be hundreds or thousands of attributes associated with each observation, magnifying the size further: imagine a 1 million by 100K matrix, or even larger! The underlying technology/infrastructure should address this issue meticulously and in a way that does NOT affect the runtime performance of the machine learning algorithm

- Persisting chunks of such data/matrices and reading them from disk on demand may not be an option at all, and even where it looks feasible, it will not adequately eliminate the problem

- While training/tuning the model, intermediate/temporary matrices may be created, such as Hessian matrices, which are typically of similar shape and size or even larger, demanding equal proportions of memory or more

- Even if clustering/partitioning techniques are applied to disperse memory across a network of server nodes, such memory partitions would need to be cloned for redundancy and fail-over, similar to how the Hadoop Distributed File System (HDFS) copies chunks of large files across multiple data nodes

- Complex matrix algebra computations are typically carried out as part of model training and convergence. This includes not just so-called simple operations such as matrix-matrix multiplication, scalar operations, additions and subtractions, but also spatial transformations such as Givens rotations and Householder transformations, typically executed as part of eigenvalue extraction, determinant evaluation, or matrix decompositions such as QR, singular value decomposition, bidiagonalization or tridiagonalization. All such operations need to be carried out with high efficiency, optimizing memory footprint and avoiding high latencies

- Most algorithms need several iterations of training/tuning, and each iteration may take minutes, hours or days to complete. The underlying infrastructure, apart from being scalable, should therefore be highly reliable and fault-tolerant. For a model that inherently takes a long time to train, it is unacceptable for a single component crash to force the entire model build to restart from the ground up. Long-running models should have appropriate 'save' points so that an operation can be recovered and restored safely, without restarting and without losing mathematical meaning or accuracy

- The underlying technology/infrastructure should help take advantage of high-speed multi-core processors or, even better, graphics processing units (GPUs), resulting in faster training and convergence of complex models

- Most importantly, the underlying technology should address all of the above concerns in a seamless fashion, with a system that is easy to use and easy to configure from both the algorithm modeler's and the system administrator's point of view

In the next blog, I will describe how Entrigna addresses many, if not all, of these issues/concerns leveraging Entrigna’s RTES technology solution.

From Real-Time Insights To Actionable Decisions - The Road Entrigna Paves

What problem does Entrigna solve?

With the tremendous increase in computing power and decrease in memory and storage costs, today's businesses are weighed down by a deluge of mostly disconnected data, much of it highly relevant to effective decision making. Data associated with business applications, processes, transactions, operations, customers, customer insights, products, product insights, policies, systems, business partnerships, competition and more produces unmanageable amounts of information in many different formats. Such data is highly volatile and inherently contains time-sensitive business intelligence reflecting momentary business performance. If detected in real time, such 'in-the-moment' business intelligence provides insights that can be used to dynamically determine the optimal action to take. Essentially, this data needs to be ploughed through to discover valuable knowledge and instantaneous insights, to make actionable decisions, and to feed those decisions back in real time into business applications, processes and operations in a way that drives business value and profitability. A few examples of such insights and decisions: product recommendations based on customer actions and purchase behavior; offer optimization based on customer insights; customer churn prediction; customer disservice detection and recommendation of recovery actions; dynamic pricing of products and pricing optimization based on in-the-moment shopping-versus-buying patterns; real-time prediction of perishable inventory shrink; and anomaly detection, such as fraud, with recommendation of course-correction actions.

To derive intelligence and insights in real time from such volatile and varying data, there is a real need for technology with capabilities such as Machine Learning, Data Mining, Rules Processing, Complex Event Processing, Predictive Analytics, Operations Research-type optimizations, Artificial Intelligence and other intelligence-generating mathematical algorithmic capabilities, coupled with the flexibility to mix and match those capabilities for more complex decision orchestration. This breadth of decision frameworks is necessary because different business objectives require different analytical approaches. For example, a rules engine works well when recognizing a customer for a milestone; event processing is well suited to identifying potential customer disservice scenarios; and optimization techniques work well when deciding which promotions to place in front of the customer.

Another challenge such technology needs to address is processing and correlating high-velocity data in live streams coming from disparate sources in a wide variety of formats. The technology should therefore be highly scalable and fault-tolerant, with extensive provisioning for distributing large amounts of memory and for massive parallelization of complex CPU-bound operations. From a licensing and maintenance point of view, it should be cost-effective and economically viable. However, no single readily available technology offers all of these capabilities out of the box with seamless implementation.

Why is this problem a big deal?

There are commercially available technologies that individually specialize in one required capability or another. For example, there are sophisticated business rules engines, technologies that excel in complex event processing, others that excel in data mining, in operations research, in traditional business intelligence, and so on. Each technology may work well within its own realm of decision-making capability, but in almost all cases these specialized technologies come from different product vendors. A few vendors offer some of them as part of a suite, but as independent products within that suite. These technologies are not necessarily designed to talk to each other and interoperate, yet orchestrating real-life complex business decisions requires combining such capabilities in a flexible and seamless fashion.


Many businesses leverage traditional business intelligence technologies alongside data warehouses and data marts with sophisticated data mining capabilities. Such traditional approaches are either time-driven or request-driven: operational and transactional data is staged and analytically processed in batch mode, and data mining and model building are static, purposed mostly to create reports and feed dashboards. Human reasoning is still needed to understand the business insights and trends so that course-correction actions can be suggested and implemented to adapt business processes to new insights. This entire process of extracting business insights and trends spans days to weeks, and by the time the insights are applied to business processes, the business environment may have evolved further, making them stale and potentially counter-productive.

Other businesses procure individually specialized technologies, as mentioned before, and make them interoperate by developing custom middleware solutions. Even then, deriving insights and actionable decisions is not comprehensive, because the required decision orchestration is still not seamless and never fully realized; it is like a split brain across disparate technologies. This approach demands large capital investment, spent on procuring disparate technologies and developing the custom middleware to make such heterogeneous technologies interoperate. Business initiatives with an urgent intent to exploit real-time insights are thus prone to long delays and long project timelines. Moreover, because decision orchestration cannot be fully realized, such initiatives get implemented with limited scope and many requirements de-scoped, and businesses lose the original business value proposition (a loss typically measured in millions of dollars) as well as their competitive edge.

How does Entrigna solve this problem?

Entrigna developed a real-time decisions platform called RTES (Real Time Expert System). RTES enables real-time decision capabilities by offering a full range of decision frameworks packaged in one technology that work together seamlessly: Rules Engine, Complex Event Processing, Machine Learning, Artificial Intelligence, Optimization, and Clustering/Classification. Essentially, the RTES platform exposes these decision frameworks as built-in modularized services that can be combined and applied to an organization's business data on a real-time basis to identify intelligent patterns that lead to real-time business insights. By packaging these decision capabilities in one technology, RTES enables seamless orchestration of higher-level decision services: a hybrid decision service can be configured as a network of individual decision services, in series and in parallel, as in an electrical circuit. Individual decision services can be rules-based, machine learning-based, classification-based, segmentation/clustering-based, predictive or regressive, or based on real-time optimization.

Since RTES works on live streams of data, it promotes event-driven data integration approaches. An event can be any number of things but is usually initiated by some action or activity; examples include a customer shopping for tennis rackets online, the sale of 1,000 winter gear items in the last hour, a snowstorm being predicted for the North-East, or a flight getting rescheduled. RTES extracts events from live streams of data in order to initiate the real-time decision process.

RTES processes live data, a.k.a. events, as it occurs, combining each event with other valuable data or other events, gaining intelligence from the data and deciding on an action to take. Sometimes knowledge of the event alone is sufficient to derive an insight and take action. More often, additional data must be leveraged to improve intelligence: for example, customer profile, transaction/sales history, channel interaction history, social activity history, or external data like weather and traffic. RTES employs a data architecture strategy commonly referred to as data virtualization, which integrates disparate data sources in real time into a usable format while decoupling data processing from intelligence derivation.

To enable the derivation of intelligence, RTES makes it easy to combine different decision frameworks. For example, to implement a specific offer-optimization requirement, RTES enables the use of decision trees to determine a customer value score, clustering to segment customers based on their attributes, neural networks to assess purchase trends by customer segment, optimization to rank the most value-generating products, additional rules to further personalize offers to a specific customer, and CEP to augment offers based on external events such as weather and national events, all orchestrated seamlessly within one single technology, i.e. RTES.
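RTES's actual API is not shown here, so purely as an illustration of the series/parallel wiring idea described above, here is a hypothetical sketch; the `in_series`/`in_parallel` combinators and the toy services (value scoring, segmentation, offer selection) are all invented for the example:

```python
from typing import Callable, Dict

Decision = Callable[[Dict], Dict]

def in_series(*services: Decision) -> Decision:
    """Series wiring: each service's output enriches the context fed to the next."""
    def run(ctx: Dict) -> Dict:
        for svc in services:
            ctx = {**ctx, **svc(ctx)}
        return ctx
    return run

def in_parallel(*services: Decision) -> Decision:
    """Parallel wiring: services see the same context; their outputs are merged."""
    def run(ctx: Dict) -> Dict:
        merged = dict(ctx)
        for svc in services:
            merged.update(svc(ctx))
        return merged
    return run

# Hypothetical stand-ins for the decision frameworks named above.
def value_score(ctx): return {"score": 2 if ctx["spend"] > 500 else 1}
def segment(ctx):     return {"segment": "frequent" if ctx["visits"] > 10 else "casual"}
def pick_offer(ctx):  return {"offer": "premium" if ctx["score"] == 2 else "standard"}

# Score and segment the customer in parallel, then pick an offer from the results.
offer_service = in_series(in_parallel(value_score, segment), pick_offer)
print(offer_service({"spend": 800, "visits": 12}))
```

The circuit analogy from the text maps directly: parallel branches evaluate independently on the same input, and the series stage consumes their combined output.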


Once actionable decisions are determined, RTES enables them to be integrated with business applications, processes and operations so that action can be taken in real time to impact business outcomes: for example, presenting optimized and personalized offers to a customer to help that customer buy his or her choice of products more easily, meeting the business objective of increased product sales. RTES makes actionable decisions accessible by means of web services, messaging, direct web sockets, database adapters and other custom adapters.


RTES-enabled machine learning and AI-based predictive decision models can also learn and adapt based on feedback from the actions taken. Online predictive models learn from action feedback in real time, while offline predictive models learn in batch mode from action feedback that is stored first and consumed later. Typically such feedback is already enabled for other business purposes, such as enterprise data management and warehousing, and RTES can tap into those existing feedback channels without necessarily having to devise new ways of consuming feedback.

How does Entrigna engage clients?

Below are the high-level steps Entrigna typically follows when initiating a client engagement.

Working collaboratively, Entrigna engages with the client by listening to the client's requirements for leveraging real-time intelligence, paying close attention to business goals. This is a mandatory step, all the more so because the concept of real-time intelligence is relatively new; as a product and services provider, it is critical for Entrigna to streamline the client's ideas, dotting the i's and crossing the t's, and in the process help the client realize much more business value from real-time insights than initially anticipated. This includes capturing the client's thoughts about what they presumed were possible solutions versus what they assumed were infeasible, impracticable, anecdotal or imaginative ones, things they thought were not implementable at all because the technologies they were aware of lacked the required capability. Of course, this step is preliminary, with the understanding that Entrigna will help uncover more cases and opportunities for real-time intelligence during the ensuing implementation.

Once the client's initial requirements are discovered, the next natural step is to thoroughly understand two important business aspects: 1) the business processes, applications and operations where real-time insight-driven actions would be implemented, and 2) the data: the types of data, the potential sources of existing data, and the sources of new data that would come into play for extracting intelligence.


The next step is more involved: hypotheses for the different intelligence scenarios are given shape in the form of algorithmic models. Entrigna employs elements of data science, applying them to the client's specific needs. This is very much a collaborative step, in which the client's subject matter experts work hand-in-hand with Entrigna to vet the intelligence models. Entrigna quickly prototypes more than one model, each trained, tuned and tested; typically the models differ mainly in how the underlying decision frameworks are mixed and matched. Entrigna then shares and reviews the model results to verify that what the models predicted is close to, or exceeds, what the client anticipated. If the accuracy of the results needs to improve, Entrigna either further fine-tunes the underlying decision algorithm or replaces it with a more advanced one, ensuring that such algorithms remain within the mathematical feasibility boundaries set by data science.

Finally, once the models are finalized, Entrigna determines how decisions will be integrated into and consumed by downstream applications, processes or operations; Entrigna then exposes the models as intelligent decision services with appropriate access mechanisms such as web services, messaging and so on.

Tell us if this blog helped and please do share your comments!!