Case Studies

Leveraging Synergies to Tackle Big Projects

Business Problem

The owner of a boutique consulting firm contacted Omni Analytics seeking assistance with analyzing survey data from their corporate clients: Milestone Systems, Fristads Kansas and Kamstrup.

Data

Utilizing the Qualtrics platform for survey deployment, Omni Analytics provided value by handling the data formatting and performing the analysis with k-means clustering and natural language processing.

Outcome

O.A.G. assisted the firm in becoming a leader in Denmark’s marketing analytics service industry by expanding their data science capabilities. Our direct involvement resulted in project extensions and renewed contracts for all of the firm’s aforementioned clients.

Techniques and Technology

Advanced R Programming, K-Means Clustering, Self Organizing Maps, Survey Analysis

Participant Segmentation

Business Problem

A local online start-up provides a matching service that connects researchers to potential participants. The founder wanted to expand his business to enterprise researchers, but needed a deeper understanding of his participant pool in order to highlight demographic clusters that might interest his new enterprise customers.

Data

Omni Analytics was given access to the participant pool data containing self-reported demographics.

Methodology

We restructured the data, then performed an unsupervised learning analysis to uncover clusters of demographics.
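
As a rough illustration of this kind of unsupervised clustering, the sketch below runs plain k-means (Lloyd's algorithm) on a handful of made-up, already-scaled demographic vectors. The data, the choice of k-means, and the function names are assumptions for illustration only; the source does not say which algorithm or features were used.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centroids


# Hypothetical participants: (scaled age, scaled weekly hours online).
participants = [
    (0.10, 0.20), (0.15, 0.25), (0.20, 0.10),
    (0.80, 0.90), (0.85, 0.80), (0.90, 0.95),
]
labels, centers = kmeans(participants, k=2)
print(labels)
```

In practice the number of clusters would be chosen by inspecting several values of k, and categorical demographics would need encoding before distances are meaningful.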

Outcome

At delivery, our client expressed surprise at the clarity and clear demarcation of his participant pool. He was reassured to know that he possessed a diverse set of clusters to allow his potential enterprise customers to A/B test across various demographics quickly.

Benefits

These groupings immediately became automated filters within the website in anticipation of the enterprise platform launch.

Techniques and Technology

K-Means Clustering, Data Strategy, Management Consulting

Anomaly Detection for Incorrect Data Submissions

Business Problem

A popular website for tracking car fuel efficiency was in need of analysis to help determine when users were submitting incorrect data on their car’s mileage.

Data

To collect data, users log in to the website and enter the amount spent at the pump, alongside their odometer readings. Unfortunately, it is all too common for users to input the wrong information. The site owner wanted a way to statistically characterize potentially erroneous entries.

Methodology

Our team leveraged the outliers package in R to develop a scheme for assessing these observations. Given previously entered data for that specific car and general information from other similar users, we were able to derive statistical bounds that would identify potentially incorrect data points.
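
One simple way to derive such bounds is sketched below in Python rather than R. It flags a new entry when it falls more than a few scaled median absolute deviations (MAD) from the median of that car's earlier entries. The median/MAD rule, the threshold of 3, and the sample readings are assumptions chosen for illustration; the sketch uses only the car's own history and is not the scheme actually delivered.

```python
from statistics import median


def robust_bounds(history, k=3.0):
    """Return (low, high) bounds from the median and scaled MAD of past values."""
    center = median(history)
    mad = median([abs(x - center) for x in history])
    # 1.4826 rescales the MAD so it estimates the standard deviation for normal data.
    spread = 1.4826 * mad
    return center - k * spread, center + k * spread


def is_suspicious(value, history, k=3.0):
    """True when a new entry falls outside the bounds implied by the history."""
    low, high = robust_bounds(history, k)
    return not (low <= value <= high)


# Hypothetical fuel-economy history (miles per gallon) for one car.
past_mpg = [31.2, 29.8, 30.5, 32.0, 30.9, 31.4, 29.5]
print(robust_bounds(past_mpg))
print(is_suspicious(305.0, past_mpg))  # a misplaced decimal point
```

A median-based rule is used here because a single wildly wrong entry in the history would inflate a mean and standard deviation and hide later errors.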

Benefits

Our client implemented these error checking rules into their app as an improvement to the overall user experience.

Techniques and Technology

Outlier Analysis

Antibiotic Efficacy Analysis on Chlamydia Trachomatis Strains

Business Problem

As another outreach project for our student assistance initiative, Omni Analytics helped a graduate student compare the efficacy of antibiotic agents on chlamydial strains.

Data

For this project, the data originated from an experiment that measured the MCC and MIC responses for cervical and urethral swabs taken after patients received a set dosage of Azithromycin, Doxycycline or Levofloxacin.

Methodology

Using standard parametric and non-parametric difference-in-means tests, we identified Doxycycline as the most potent drug for killing the chlamydial strains.
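
For readers unfamiliar with non-parametric comparisons, the following sketch shows one such test: an exact two-sided permutation test for a difference in means. The concentration values are invented for illustration and the two groups are deliberately left unnamed; the source does not state which specific tests were run.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation test for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance guards against floating-point ties with the observed value.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical inhibitory concentrations for two unnamed agents.
agent_a = [0.03, 0.06, 0.03, 0.06, 0.03]
agent_b = [0.25, 0.50, 0.25, 0.50, 0.25]
print(exact_permutation_pvalue(agent_a, agent_b))
```

Exhaustive enumeration is only practical for small samples; with larger groups one would sample random relabellings instead.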

Outcome

With our assistance, the graduate student successfully defended their PhD in biomedical and health sciences.

Techniques and Technology

Non-Parametric Distributional Analysis

Machine Learning for Alternative Credit Scoring of the Historically Unbanked

Business Problem

Based in Lima, Peru, a fin-tech company was leveraging traditional statistical techniques to assist in loan disbursement for the historically unbanked. As the company grew, they sought out Omni Analytics to help push forward an initiative to incorporate machine learning best practices for the development of more accurate credit risk models derived from their psychometric assessments.

Data

O.A.G. was initially engaged after a successful private data competition where we outperformed not only their internal team, but also six other consulting companies. We then collaborated, working side by side to formalize their modeling procedure for future scalability. As the initial initiative concluded, O.A.G. provided further research into the implementation of alternative data sources, new machine learning ensembling techniques, as well as general statistical consulting support and on-site training.

Methodology

During the course of our collaboration, we reduced their modelling time from 3 days per model to less than one hour while increasing the accuracy by over 10%. With the final implementation of the automated system, they were able to construct ensemble models within minutes. During this period, they also saw over 100% growth in clients and scorable tests, all of which were successfully served and analyzed thanks to the new framework.

Techniques and Technology

Regularized Regression, LASSO Regression, Logistic Regression, Gradient Boosted Machines, Advanced R Automation, Specialized R&D

Global Customer Segmentation

Business Problem

A Fortune 500 company’s health and nutrition division, one commanding yearly revenue of over $300M, wanted to better understand the needs of their customers. As part of a consortium of consultants, Omni Analytics was charged with designing, implementing, and leading the technical portion of their global customer segmentation study.

Data

In this study, nearly 1500 respondents across 8 global regions supplied answers to an online conjoint study requesting participants to articulate their attitude towards the pricing, lead time, ease of ordering and other aspects of the supply chain with respect to their interaction with the Fortune 500 host company.

Methodology

The survey was conducted on the SurveyMonkey platform, and the responses were transferred onto the Omni Analytics servers, where hierarchical Bayes analysis was performed. The resulting part-worths were clustered using both K-means and self-organizing maps in an effort to extract meaningful but uncorrelated clusters.

Outcome

Our analysis identified intuitive customer segments, such as those who are price sensitive but care about having ample technical support, as well as other groups that were not price sensitive but wanted a high quality product shipped as fast as possible.

Benefits

The client was able to develop a meaningful global strategy that contributed to the increase in year over year sales after its successful implementation.

Techniques and Technology

Advanced R Programming, K-Means Clustering, Self Organizing Maps, Survey Analysis, Bayesian Analysis

Design and Implementation of an Automated Reporting Solution

Business Problem

Our client produced a monthly newsletter using manually updated Excel data and Adobe InDesign. Month after month, an already swamped system administrator would dedicate 25 hours to the cumbersome process of producing that month’s report.

Methodology

Omni Analytics was hired to take ownership of the process. We redesigned the data ingestion procedure, reducing the manual work to nothing more than a one-time copy-paste step. All other steps, including the recalculation of statistics, graphs and outputs, were coded in R and converted into an R Markdown report.

Outcome

Upon project delivery, our automated report solution reduced the publication time from 2 weeks to only 1 day.

Techniques and Technology

R Markdown, knitr, Advanced R Automation

College Major Recommender System

Business Problem

A Canadian start-up was looking for technology that would allow them to recommend college majors to high school students. Working against both shortened attention spans and regulatory requirements, they needed an intelligent algorithm that would ask the fewest questions while still maintaining a high level of predictive accuracy.

Data

To collect data, university students were tracked from their entrance into the university until their successful matriculation. Demographics, as well as their answers to the questionnaire, were stored across time.

Methodology

Drawing on previous experience, Omni Analytics assisted in establishing the field of study matching process. In addition to the creation of the questionnaire guidelines, OAG developed an innovative intelligent variable selection algorithm that uses past responses to select the most appropriate question to ask in real time.
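
The idea of choosing the next question from past responses can be sketched as follows. This is only one plausible approach, assumed here for illustration: keep the past respondents whose answers agree with the current session, then ask the unanswered question that leaves the least expected uncertainty (entropy) about the major. The records, question names, and the entropy criterion are all hypothetical and are not the algorithm OAG developed.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def next_question(records, answered):
    """Pick the unanswered question with the lowest expected entropy of majors."""
    # Keep only past respondents whose answers agree with the current session.
    pool = [
        r for r in records
        if all(r["answers"][q] == a for q, a in answered.items())
    ]
    questions = [q for q in records[0]["answers"] if q not in answered]
    best_q, best_score = None, float("inf")
    for q in questions:
        groups = {}
        for r in pool:
            groups.setdefault(r["answers"][q], []).append(r["major"])
        expected = sum(len(g) / len(pool) * entropy(g) for g in groups.values())
        if expected < best_score:
            best_q, best_score = q, expected
    return best_q


# Hypothetical past respondents.
records = [
    {"answers": {"likes_labs": "yes", "likes_math": "yes"}, "major": "physics"},
    {"answers": {"likes_labs": "no", "likes_math": "yes"}, "major": "physics"},
    {"answers": {"likes_labs": "yes", "likes_math": "no"}, "major": "history"},
    {"answers": {"likes_labs": "no", "likes_math": "no"}, "major": "history"},
]
print(next_question(records, answered={}))
```

Asking the most informative question first is what keeps the questionnaire short: questions that no longer separate the remaining candidate majors are never asked.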

Outcome

Utilizing our algorithm, the start-up was able to secure $50,000 in funding.

Techniques and Technology

Specialized R&D, Distance based Similarity Analysis, Algorithm Design

Sentiment Analysis on Traveler Reviews

Business Problem

Our client was looking to prototype NLP capabilities within his organization by analyzing travelers’ experiences using review data from tripadvisor.com.

Data

The data came in the form of 38,556 web scraped text reviews for traveler stays at various locations across South Korea.

Methodology

After constructing a term document matrix, we used n-gram analysis in conjunction with sequential word analysis to assess not only how often certain phrases were used, but also the frequency with which certain words appeared in the same reviews.
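
The two counts described here, phrase frequency and within-review co-occurrence, can be sketched in a few lines. The reviews below are invented, the tokenizer is deliberately crude, and only bigrams are counted; this is an illustration of the idea, not the project's code.

```python
from collections import Counter
from itertools import combinations
import re


def tokenize(text):
    """Lowercase a review and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bigram_counts(reviews):
    """Count how often each pair of adjacent words occurs across all reviews."""
    counts = Counter()
    for review in reviews:
        tokens = tokenize(review)
        counts.update(zip(tokens, tokens[1:]))
    return counts


def cooccurrence_counts(reviews):
    """Count, for each unordered word pair, the number of reviews containing both."""
    counts = Counter()
    for review in reviews:
        vocab = sorted(set(tokenize(review)))
        counts.update(combinations(vocab, 2))
    return counts


# Hypothetical reviews.
reviews = [
    "Great location and friendly staff",
    "Friendly staff but a noisy room",
    "Great location, noisy room",
]
print(bigram_counts(reviews).most_common(3))
print(cooccurrence_counts(reviews)[("location", "room")])
```

Sorting each review's vocabulary before pairing means every word pair is stored under one canonical key, so ("location", "room") and ("room", "location") are never counted separately.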

Outcome

Our analysis formed the baseline

Techniques and Technology

Sentiment Analysis, Natural Language Processing, Text Mining

Attrition Modeling Design and Implementation

Business Problem

As part of their initiative to become a more data driven company, a large Fortune 500 national utility company engaged O.A.G. to scope, design, and implement several statistical models to support marketing efforts to get, keep, and grow their customer base. The first project within the engagement involved helping them use machine learning to deal with customer attrition issues in one of their southern jurisdictions.

Data

Accessing their customer data was the first hurdle we had to overcome. As at many companies, customer records were siloed within multiple database systems, none of which natively spoke to one another. The size of the data, over 20 GB, also posed an extra set of constraints on our ability to interact with the tables.

Methodology

Using our standardized DDEMA approach, Omni Analytics began first with defining the set of variables necessary for analysis. With these definitions, we successfully narrowed down the amount of data necessary to build the statistical models. We engaged I.T. and provided detailed instructions on how to query and transfer the data. This put us into a position where we could follow up with the exploration and modeling phase of the project. Utilizing regression and tree based methods, we were able to elucidate the drivers of attrition within their subscriber pool.

Outcome

Omni Analytics returned to the client a detailed report and a statistical model that the client could directly insert into their SAP system to create customer account flags.

Benefits

This initial project led to an expanded engagement with the company. Under the currently ongoing arrangement, O.A.G. regularly produces dashboard visualizations, KPI calculations, statistical models, and training material.

Techniques and Technology

Advanced R Programming, Cloud Configuration, GPU Processing

Statistical Model Auditing for the Prediction of Language Translation Completion Times

Business Problem

An online translation service that leverages crowdsourced translators was interested in estimating project completion times.

Data

Their back end data source consisted of an Excel model with soft coded assumptions serving as the primary information storage and forecasting tool.

Methodology

Omni Analytics began first with an audit of the current deterministic model in place. After a careful analysis, OAG found that the current model was inadequate at estimating project completion times. Through simulation studies, it became even more apparent that the data itself was inadequate for analysis as well. Omni Analytics then created and proposed a new database structure that would more accurately characterize the business, while conforming to database standards that would support more advanced analysis.

Outcome

The client recognized the appropriate pivot in the scope we’d proposed and praised our group for being forthright with the assessment.

Techniques and Technology

Data Strategy, Process Auditing

Statistical Analysis of Revenue Drivers

Business Problem

Wanting to supplement a larger market research docket, our client was interested in learning what factors contributed to the successful execution of a trade show.

Data

Data had been collected on trade shows across six countries and within six industries, including information about the number of attendees, the fee schedule, market size, number of companies, and their social media popularity. Due to constraints on the availability of data, not every trade show had all of its information recorded. This created a standard “missing data problem.”

Methodology

Using multiple imputation, in conjunction with linear regression, we identified a set of statistically significant drivers that were then used to derive a trade show construction and targeting policy.

Outcome

In addition to identifying profitable country regions, our analysis backed up common insights held about trade show profitability and pointed out a few counter-intuitive ones. The final policy suggested that higher fees for exhibitors were reasonable, but not for participants. In the end, attendance is key and trade shows with larger community member pools and higher market sizes are also the ones likely to attract the most participants.

Benefits

Upon delivery, our analysis was included in the research docket and became the focal point of the strategic targeting discussion.

Techniques and Technology

Regression Analysis

Neurological Response Analysis on Lab Mice

Business Problem

As part of our mentoring initiative, we took on a project to assist a graduate student with their research into neurological responses exhibited by rats when exposed to brain altering drugs.

Data

A double blind, full factorial randomized experiment was conducted to compare the effect of the different drugs.

Outcome

In the end, the student successfully defended their master’s thesis using our analysis.

Techniques and Technology

Non-Parametric Distributional Analysis

Game Economy Creation

Business Problem

A mobile development start-up company was in need of a game economy in which users could buy, sell and exchange in-game tokens for enhancements, additional playable characters, or power-ups.

Data

Leveraging our team’s deep expertise in mathematics, statistics and economics, we derived a skill based experience function that controlled user progression through the game. An additional layer of mathematics was used to derive a set of prices for the in-game items.

Outcome

Beyond the creation of the game’s economy, we created an integrated parameterized spreadsheet that allowed our client to adjust individual aspects of the pricing environment. This received substantial praise from the client and allowed them to continue iterating on their economy after the official engagement ended.

Techniques and Technology

Econometrics, Curve Fitting, Microeconomics

Bullet Matching Analysis

Business Problem

As part of Iowa State University’s big data initiative, Omni Analytics has been brought on as a partner to provide supplemental research support for the Center for Statistics and Applications in Forensic Evidence (CSAFE).

Data

As part of our contract, we’ve been tasked with developing statistical quality measures to assess the conditions of cross sectional bullet scans.

Methodology

As an on-going project, we intend to derive, implement, and validate these statistical measures against the current database of image scans.

Techniques and Technology

Advanced R Programming, High Performance Computing

Userbase Modeling

Business Problem

Online there are more dating sites than there are people to populate them, but one Australian start-up had a different take on it. Instead of mindlessly swiping, what about hosting in-person parties where invites come from those in your close network? With this premise, Omni Analytics Group was contacted to help develop a mathematical model to estimate the critical mass required to simultaneously reach viral status and profitability.

Methodology

OAG discussed the business model with the founder in great detail to derive a set of modeling assumptions and parameters that could influence monthly growth. With these numbers in tow, we set up a simulation that could estimate the long term growth of the user base.

Outcome

A grid search of parameters led us to a breakeven point that would produce a sustainable business without requiring large numbers of initial participants or hefty attendee fees.
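
A simulation plus grid search of this kind might look like the sketch below. The growth rule (a fixed monthly invitation rate minus a fixed churn rate), the cost figure, and every parameter name are assumptions made up for illustration; the actual model's assumptions came from discussions with the founder and are not given in the source.

```python
from itertools import product


def simulate_users(initial_users, invite_rate, churn_rate, months):
    """Deterministic monthly growth: new users arrive by invitation, some leave."""
    users = float(initial_users)
    path = [users]
    for _ in range(months):
        users = users + invite_rate * users - churn_rate * users
        path.append(users)
    return path


def first_breakeven_month(path, fee_per_user, monthly_cost):
    """First month in which fee revenue covers the fixed monthly cost, else None."""
    for month, users in enumerate(path):
        if users * fee_per_user >= monthly_cost:
            return month
    return None


def grid_search(initial_grid, fee_grid, invite_rate, churn_rate, monthly_cost, months=24):
    """Try every (initial users, fee) pair and return those that break even, fastest first."""
    results = []
    for initial_users, fee in product(initial_grid, fee_grid):
        path = simulate_users(initial_users, invite_rate, churn_rate, months)
        month = first_breakeven_month(path, fee, monthly_cost)
        if month is not None:
            results.append((month, initial_users, fee))
    return sorted(results)


# Hypothetical parameter grid.
for month, initial_users, fee in grid_search([50, 100], [5, 10], 0.2, 0.1, 1150):
    print(month, initial_users, fee)
```

Sorting by breakeven month is one possible ranking; to favour small launch cohorts or low fees, the sort key could be changed to put those columns first.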

Benefits

After signing off on the growth model, the start-up founder was able to use it to secure VC funding.

Techniques and Technology

Statistical Curve Fitting

Customer Segmentation with Psychographic Indices

Business Problem

An upstart consulting firm needed assistance with performing a complex customer segmentation for their largest client, a beer manufacturer, looking to understand the types of customers consuming their product.

Data

The data consisted of an online survey where beer drinkers discussed their preferences and consumption scenarios.

Methodology

Through careful observation, Omni Analytics Group was able to identify and exploit the natural indices found within the survey questions. With these partitions, an alternative index was created to serve as the inputs for a PCA based scoring clustering search.

Outcome

The hybrid approach worked wonderfully, successfully identifying seven distinct consumer groups, each with an obvious narrative well suited for marketing.

Benefits

Our work formed the foundation of the analysis project, securing a follow-up contract for the upstart consulting firm.

Techniques and Technology

K-Means Clustering, Self Organizing Maps, Principal Component Analysis, Data Visualization

Psychometrics Work Style Archetype Matching Algorithm

Business Problem

A San Francisco based human resources start-up needed data science expertise for the development of their job personality archetype matching system.

Data

As their data science mercenary team, Omni Analytics Group designed their psychometric matching framework from the ground up.

Methodology

Under our lead, their on-board industrial psychologist constructed the survey and initialized the Bayesian-esque priors while we developed and coded the archetype matching algorithm. The end result was a scalable recommender system which, after achieving over 75% accuracy in live alpha testing, was later deployed and has now matched over 30,000 beta users.

Techniques and Technology

Specialized R&D, Distance based Similarity Analysis, Data Visualization, Algorithm Design

Healthcare Cost Estimation

Business Problem

Having operated as a non-profit for 10 years in the healthcare space, our client wanted to expand its offerings of health informatic related services. They were particularly interested in completing one of their long time initiatives, the creation of a statistical modeling procedure and algorithm that would estimate the cost of care after accounting for severity and risk adjustments. Not only would these cost estimates be used to inform healthcare individuals on expected costs for individual procedures, but they would also serve as a provider accountability tool.

Data

Omni Analytics was handed cryptographically hashed claims data for over 49 different medical ailments. This data, stored on the cloud across multiple databases, was restructured for easier analysis.

Methodology

Working alongside the client, we established a large workflow diagram outlining the contingencies to account for. These included modifications for the risk and severity adjustments, minimum case requirements, and limits for the amount of missingness in specific columns. This workflow was then implemented in R utilizing the tidyverse package set.

Outcome

Before the final close-out, our client was utilizing a beta version of the algorithm to produce reports for dissemination. At close-out, the client expressed gratitude for quickly developing such a high quality, robust procedure.

Techniques and Technology

Regression Analysis, Advanced R Programming, R Package Development, High Performance Computing

Time Series Forecasting for 242,000 SKUs

Business Problem

Forecasting product demand is difficult, especially when managing over 242,000 SKUs. To assist in this task, Omni Analytics was brought in to perform time series analysis on an e-commerce site’s sales data in hopes of ultimately reducing stock-outs and over accumulation of inventory.

Data

Inside the client’s data storage systems resided five years of monthly sales data for all 242,000 products, some of which had not been on the market for even 6 months.

Methodology

To handle this big data problem, Omni Analytics developed a customized, ensembled forecasting procedure that fit ARIMA, Holt-Winters and Neural Network models to each of the products in parallel. After training, the procedure would then assess the predictive accuracy on the hold-out sample, rank the models and then return final sales forecast bounds.
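
The fit, score on a hold-out, and rank loop can be sketched for a single product as follows. To keep the example dependency-free, three deliberately simple stand-in forecasters (last value, historical mean, and simple exponential smoothing) replace the ARIMA, Holt-Winters and neural network models named above; the sales figures and function names are likewise hypothetical.

```python
from statistics import mean


def naive_forecast(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def mean_forecast(train, horizon):
    """Repeat the historical average."""
    return [mean(train)] * horizon


def ses_forecast(train, horizon, alpha=0.5):
    """Simple exponential smoothing with a fixed smoothing weight."""
    level = train[0]
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


# Stand-in candidates; the real procedure used ARIMA, Holt-Winters and neural networks.
CANDIDATES = {"naive": naive_forecast, "mean": mean_forecast, "ses": ses_forecast}


def rank_models(series, holdout=3):
    """Fit each candidate on the training part and rank by hold-out mean absolute error."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {}
    for name, model in CANDIDATES.items():
        forecast = model(train, holdout)
        scores[name] = mean([abs(f - y) for f, y in zip(forecast, test)])
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical monthly unit sales for one SKU.
monthly_sales = [10, 12, 11, 13, 12, 20, 21, 22, 21, 22, 21]
for name, mae in rank_models(monthly_sales):
    print(name, round(mae, 3))
```

Because each product is scored independently, a loop like this can be distributed across workers, for example with Python's `concurrent.futures`, which is the property that makes a per-SKU approach feasible at this scale.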

Outcome

On live testing, the algorithm was shown to save the company $10,000 a month, with further room for improvement through model parameter refinement and the insertion of additional inventory data.

Techniques and Technology

Time Series Modeling, Neural Networks, ARIMA, Data Visualization, Parallel Processing

Segmentation for Targeted Marketing to Mothers

Business Problem

A baby product manufacturer, in order to inspire future innovations, wanted to better understand the attitude of a certain segment of their customer base, females with one or more children. As a member of a larger team, Omni Analytics was delegated as the technical lead to develop a global segmentation model that reflected the attitudes and consumer behavior of mothers.

Data

Over the course of 3 months, the online attitudinal survey was conducted using the SurveyMonkey platform, targeted at mothers from Brazil, China, the United Kingdom, and the United States. The total number of respondents was 6,531. Once extracted from the system, the data was restructured to fit standard row and column conventions for easier insertion into the R statistical programming environment.

Methodology

Omni Analytics leveraged Self Organizing Maps and K-Means algorithms to identify groups of mothers that share similar characteristics.

Outcome

The analysis found intuitive groups for targeted marketing, which led to a successful campaign.

Techniques and Technology

K-Means Clustering, Data Strategy, Management Consulting

Employee Feedback Topic Summarization with Latent Dirichlet Allocation and NLP

Business Problem

A leading for-profit managed health care company conducts a yearly engagement survey to better understand employee attitudes and thoughts toward management, company policies, and overall satisfaction.

Data

In three short questions, our client’s survey collects open text feedback from over 10,000 respondents, roughly 25% of the company’s employee base. Omni Analytics Group was given access to this data and tasked with categorizing the text into meaningful, actionable summaries that management could investigate.

Methodology

With the natural language processing capabilities in R, we leveraged regular expressions, latent Dirichlet allocation and Shiny to clean, model and explore the data.

Outcome

Our efforts created a meaningful breakdown of the responses into 20 easily interpretable topics. We then circled back with our company contact to get business context for the word clusterings. The topics were validated and then included into a report that was shown directly to the executive team.

Techniques and Technology

Topic Analysis, Natural Language Processing, Text Mining, Latent Dirichlet Allocation

Driver Profiles from Accelerometer Data

Business Problem

Insurance companies have long wanted to better understand what goes on “behind the wheel” of their drivers. Insight into these driving patterns could help improve pricing, inspire incentive structures, and even create a framework to anticipate claims.

Data

With accelerometer data from a convenience sample, the client engaged Omni Analytics Group to analyze the data for distinct driving patterns that would be used to form the internal narrative around customer driving styles.

Methodology

Using well established clustering and visualization techniques, our team performed standard K-Means clustering and supplemented that algorithm with a Self Organizing Map. The resulting cluster solutions were analyzed visually with parallel coordinate plots, which is a common technique for accentuating the differences across multiple variables.

Outcome

At delivery, our client expressed appreciation for finding easily interpretable profiles, a task their internal group was unable to accomplish.

Techniques and Technology

K-Means Clustering, Self Organizing Maps, Principal Component Analysis, Data Visualization

Sentiment Analysis for Stock Trading

Business Problem

A start-up in the financial services space wanted to tap into the growing trend of social sentiment based stock trading using live Twitter data.

Methodology

Utilizing Twitter’s API, Omni Analytics engineers created a lightning fast dictionary based scoring model that classified stock related tweets into six emotional categories: joy, fear, disgust, sadness, surprise and anger.
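
A dictionary based scorer of this kind is fast because classifying a tweet is little more than a series of hash lookups. The sketch below shows the idea with a tiny made-up lexicon covering the six categories; the words, the tie-breaking behaviour, and the function names are illustrative assumptions, and no Twitter API calls are shown.

```python
from collections import Counter
import re

# Hypothetical miniature lexicon; a production dictionary would be far larger.
LEXICON = {
    "soar": "joy", "rally": "joy", "gain": "joy",
    "crash": "fear", "plunge": "fear", "panic": "fear",
    "fraud": "disgust", "scam": "disgust",
    "loss": "sadness", "bankrupt": "sadness",
    "unexpected": "surprise", "shock": "surprise",
    "outrage": "anger", "furious": "anger",
}


def score_tweet(text):
    """Count lexicon hits per emotion category in one tweet."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(LEXICON[t] for t in tokens if t in LEXICON)


def classify_tweet(text):
    """Return the emotion with the most hits, or None when no lexicon word appears."""
    scores = score_tweet(text)
    if not scores:
        return None
    return scores.most_common(1)[0][0]


print(classify_tweet("Panic as $XYZ shares crash after unexpected loss"))
```

Exact-match lookup misses inflected forms such as "crashed" or "gains"; stemming the tokens or expanding the lexicon would be the usual next step.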

Outcome

With our algorithm in tow, they created a website that provided subscribing members stock tips based on our sentiment algorithm and other industry metrics.

Techniques and Technology

Sentiment Analysis, Natural Language Processing, Text Mining

Artificial Intelligence based First Contact Optimization

Business Problem

A premier coffee subscription company wanted to leverage artificial intelligence to determine the optimal coffee roast to send to its first time customers.

Methodology

After mining the customer data for patterns, we built two independent supervised learning models for the identification of features relevant to customer lifetime value.

Outcome

The intellectual property developed for Atlas Coffee Club was fully integrated into their back-end systems and procedures were put in place to better track the lifetime value of customers given their roast recommendation.

Benefits

The company’s first contact procedure was streamlined and model estimates suggest a potential 20% or greater reduction in attrition for certain optimized scenarios.

Techniques and Technology

Reinforcement Learning, Exploratory Data Analysis, Prescriptive Analytics, Parallel Computing, Supervised Learning

 
