The owner of a boutique consulting firm contacted Omni Analytics seeking assistance with analyzing survey data from their corporate clients: Milestone Systems, Fristads Kansas and Kamstrup.
With the surveys deployed on the Qualtrics platform, Omni Analytics provided value by handling the data formatting and performing the analysis with k-means clustering and natural language processing.
O.A.G. assisted the firm in becoming a leader in Denmark’s marketing analytics service industry by expanding its data science capabilities. Our direct involvement resulted in project extensions and renewed contracts for all of the firm’s aforementioned clients.
Advanced R Programming, K-Means Clustering, Self Organizing Maps, Survey Analysis
A local online start-up provides a matching service that connects researchers to potential participants. The founder wanted to expand his business to enterprise researchers, but needed a deeper understanding of his participant pool in order to highlight demographic clusters that might interest his new enterprise customers.
Omni Analytics was given access to the participant pool data containing self-reported demographics.
We restructured the data, then performed an unsupervised learning analysis to uncover clusters of demographics.
At delivery, our client expressed surprise at the clarity and clear demarcation of his participant pool. He was reassured to know that he possessed a diverse set of clusters to allow his potential enterprise customers to A/B test across various demographics quickly.
These groupings immediately became automated filters within the website in anticipation of the enterprise platform launch.
K-Means Clustering, Data Strategy, Management Consulting
A popular website for tracking car fuel efficiency was in need of analysis to help determine when users were submitting incorrect data on their car’s mileage.
To collect data, users log in to the website and enter the amount spent at the pump, alongside their odometer readings. Unfortunately, it is all too common for users to input the wrong information. The site owner wanted a way to statistically characterize potentially erroneous entries.
Our team leveraged the outlier package in R to develop a scheme for assessing these observations. Given previously entered data for that specific car and general information from other similar users, we were able to derive statistical bounds that would identify potentially incorrect data points.
Our client implemented these error-checking rules in their app as an improvement to the overall user experience.
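The idea of deriving bounds from a car's own history, with a fallback to similar cars, can be illustrated with a minimal sketch. This is not the R scheme used in the project. It assumes a median and MAD rule, and the function names, the `min_history` threshold and the sample readings are all made up for illustration.

```python
from statistics import median

def mad_bounds(values, k=3.0):
    """Return (low, high) as median +/- k * scaled MAD (assumed rule, not the project's)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation under normality
    return med - k * scale, med + k * scale

def flag_entry(new_mpg, car_history, similar_cars, k=3.0, min_history=5):
    """Return True when a new fuel-economy entry falls outside the bounds.

    Uses the car's own history when it has at least `min_history` entries,
    otherwise falls back to readings from similar cars (hypothetical policy).
    """
    reference = car_history if len(car_history) >= min_history else similar_cars
    low, high = mad_bounds(reference, k)
    return not (low <= new_mpg <= high)

# Made-up miles-per-gallon readings for one car.
history = [31.2, 29.8, 30.5, 32.1, 30.9, 31.4]
print(flag_entry(30.7, history, similar_cars=[]))  # a plausible reading
print(flag_entry(3.1, history, similar_cars=[]))   # likely a misplaced decimal point
```

A robust rule such as this one is less distorted by the very typos it is meant to catch than a mean and standard deviation rule would be. Note that the bounds collapse to a single value if every reference reading is identical.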
Outlier Analysis
As another outreach project for our student assistance initiative, Omni Analytics helped a graduate student compare the efficacy of antibiotic agents on chlamydial strains.
For this project, the data originated from an experiment that measured the MCC and MIC responses for cervical and urethral swabs taken after patients received a set dosage of Azithromycin, Doxycycline or Levofloxacin.
Using standard parametric and non-parametric difference-in-means tests, we identified Doxycycline as the most potent drug against the chlamydial strains.
With our assistance, the graduate student successfully defended their PhD in biomedical and health sciences.
Non-Parametric Distributional Analysis
Based in Lima, Peru, a fin-tech company was leveraging traditional statistical techniques to assist in loan disbursement for the historically unbanked. As the company grew, they sought out Omni Analytics to help push forward an initiative to incorporate machine learning best practices for the development of more accurate credit risk models derived from their psychometric assessments.
O.A.G. was initially engaged after a successful private data competition where we outperformed not only their internal team, but also six other consulting companies. We then collaborated, working side by side to formalize their modeling procedure for future scalability. As the initial initiative concluded, O.A.G. provided further research into the implementation of alternative data sources, new machine learning ensembling techniques, as well as general statistical consulting support and on-site training.
During the course of our collaboration, we reduced their modeling time from 3 days per model to less than one hour while increasing the accuracy by over 10%. With the final implementation of the automated system, they were able to construct ensemble models within minutes. During this period, they also saw over 100% growth in clients and scorable tests, all of which were successfully served and analyzed thanks to the new framework.
Regularized Regression, LASSO Regression, Logistic Regression, Gradient Boosted Machines, Advanced R Automation, Specialized R&D
A Fortune 500 company’s health and nutrition division, one commanding yearly revenue of over $300M, wanted to better understand the needs of their customers. As part of a consortium of consultants, Omni Analytics was charged with designing, implementing, and leading the technical portion of their global customer segmentation study.
In this study, nearly 1,500 respondents across 8 global regions completed an online conjoint study that asked them to articulate their attitudes towards pricing, lead time, ease of ordering and other aspects of the supply chain in their dealings with the Fortune 500 host company.
The survey was conducted on the Survey Monkey platform, and the responses were transferred onto the Omni Analytics servers, where hierarchical Bayes analysis was performed. The resultant part-worths were clustered using both K-means and self-organizing maps in an effort to extract meaningful, but uncorrelated clusters.
Our analysis identified intuitive customer segments, such as those who are price sensitive but care about having ample technical support, as well as other groups that were not price sensitive but wanted a high-quality product shipped as fast as possible.
The client was able to develop a meaningful global strategy that contributed to the increase in year-over-year sales after its successful implementation.
Advanced R Programming, K-Means Clustering, Self Organizing Maps, Survey Analysis, Bayesian Analysis
Our client produced a monthly newsletter using manually updated Excel data and Adobe InDesign. Month after month, an already swamped system administrator would dedicate 25 hours to the cumbersome process of producing that month’s report.
Omni Analytics was hired to take ownership of the process. We redesigned the data ingestion procedure, reducing the manual work to nothing more than a one-time copy-paste step. All other steps, including the recalculation of statistics, graphs and outputs, were coded in R and converted into an R Markdown report.
Upon project delivery, our automated report solution reduced the publication time from 2 weeks to only 1 day.
R Markdown, knitr, Advanced R Automation
A Canadian start-up was looking for technology that would allow them to recommend college majors to high school students. Constrained by both short attention spans and regulatory requirements, they needed an intelligent algorithm that would ask the fewest possible questions while still maintaining a high level of predictive accuracy.
To collect data, university students were tracked from their entrance into the university until their successful matriculation. Demographics, as well as their answers to the questionnaire, were stored across time.
Drawing on previous experience, Omni Analytics assisted in establishing the field of study matching process. In addition to the creation of the questionnaire guidelines, OAG developed an innovative intelligent variable selection algorithm that uses past responses to select the most appropriate question to ask in real time.
Utilizing our algorithm, the start-up was able to secure $50,000 in funding.
Specialized R&D, Distance based Similarity Analysis, Algorithm Design
Our client was looking to prototype NLP capabilities within his organization by analyzing travelers’ experiences using review data from tripadvisor.com.
The data came in the form of 38,556 web-scraped text reviews of traveler stays at various locations across South Korea.
After constructing a term-document matrix, we used n-gram analysis in conjunction with sequential word analysis to assess not only how often certain phrases were used, but also the frequency with which certain words appeared in the same reviews.
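The two counts described above, phrase frequency and within-review word co-occurrence, can be sketched in a few lines. This is only an illustration under simple assumptions: the tokenizer, the function names and the three sample reviews are invented, and counters stand in for a full term-document matrix.

```python
from collections import Counter
from itertools import combinations
import re

def tokenize(text):
    """Lowercase a review and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())

def ngram_counts(reviews, n=2):
    """Count how often each run of n consecutive words appears across all reviews."""
    counts = Counter()
    for review in reviews:
        tokens = tokenize(review)
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts

def cooccurrence_counts(reviews):
    """Count, for each unordered word pair, the number of reviews containing both words."""
    counts = Counter()
    for review in reviews:
        vocabulary = sorted(set(tokenize(review)))
        counts.update(combinations(vocabulary, 2))
    return counts

# Invented reviews, for illustration only.
reviews = [
    "Great location and friendly staff",
    "Friendly staff but a small room",
    "Small room, great location",
]
bigrams = ngram_counts(reviews)
pairs = cooccurrence_counts(reviews)
print(bigrams[("friendly", "staff")])  # phrase frequency
print(pairs[("great", "location")])    # reviews mentioning both words
```

Co-occurrence pairs are stored in alphabetical order, so a lookup such as `pairs[("room", "small")]` must follow that order.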
Our analysis formed the baseline
Sentiment Analysis, Natural Language Processing, Text Mining
As part of their initiative to become a more data driven company, a large Fortune 500 national utility company engaged O.A.G. to scope, design, and implement several statistical models to support marketing efforts to get, keep, and grow their customer base. The first project within the engagement involved helping them use machine learning to deal with customer attrition issues in one of their southern jurisdictions.
Accessing their customer data was the first hurdle we had to overcome. As at many companies, customer records were siloed within multiple database systems, none of which natively spoke to one another. The size of the data, over 20 GB, also posed an extra set of constraints on our ability to interact with the tables.
Using our standardized DDEMA approach, Omni Analytics began first with defining the set of variables necessary for analysis. With these definitions, we successfully narrowed down the amount of data necessary to build the statistical models. We engaged I.T. and provided detailed instructions on how to query and transfer the data. This put us into a position where we could follow up with the exploration and modeling phase of the project. Utilizing regression and tree based methods, we were able to elucidate the drivers of attrition within their subscriber pool.
Omni Analytics returned to the client a detailed report and a statistical model that the client could directly insert into their SAP system to create customer account flags.
This initial project led to an expanded engagement with the company. Under the current ongoing arrangement, O.A.G. regularly produces dashboard visualizations, KPI calculations, statistical models, and training material.
Advanced R Programming, Cloud Configuration, GPU Processing
An online translation service that leverages crowdsourced translators was interested in estimating project completion times.
Their back end data source consisted of an Excel model with soft coded assumptions serving as the primary information storage and forecasting tool.
Omni Analytics began first with an audit of the current deterministic model in place. After a careful analysis, OAG found that the current model was inadequate at estimating project completion times. Through simulation studies, it became even more apparent that the data itself was inadequate for analysis as well. Omni Analytics then created and proposed a new database structure that would more accurately characterize the business, while conforming to database standards that would support more advanced analysis.
The client recognized the appropriate pivot in the scope we’d proposed and praised our group for being forthright with the assessment.
Data Strategy, Process Auditing
Wanting to supplement a larger market research docket, our client was interested in learning what factors contributed to the successful execution of a trade show.
Data had been collected on trade shows across six countries and within six industries, including information about the number of attendees, the fee schedule, market size, number of companies, and their social media popularity. Due to constraints on the availability of data, not every trade show had all of its information recorded. This created a standard “missing data problem”.
Using multiple imputation, in conjunction with linear regression, we identified a set of statistically significant drivers that were then used to derive a trade show construction and targeting policy.
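As a rough illustration of how multiple imputation combines with linear regression, the sketch below fills in a partially missing predictor several times, refits the regression on each completed dataset, and pools the slopes with Rubin's rules. It is a simplified stand-in rather than the procedure used in the project. It assumes a single predictor, imputes by stochastic regression without drawing the imputation-model parameters (a fuller treatment would), and uses invented variable names and numbers.

```python
import random
from statistics import mean, variance

def ols(x, y):
    """Fit y ~ x; return (intercept, slope, residual_sd, slope_variance)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = rss / (len(x) - 2)
    return intercept, slope, resid_var ** 0.5, resid_var / sxx

def impute_once(x, y, rng):
    """Replace each missing x (None) with a noisy prediction from y."""
    seen = [(xi, yi) for xi, yi in zip(x, y) if xi is not None]
    a, b, sd, _ = ols([yi for _, yi in seen], [xi for xi, _ in seen])
    return [xi if xi is not None else a + b * yi + rng.gauss(0.0, sd)
            for xi, yi in zip(x, y)]

def pooled_slope(x, y, m=20, seed=0):
    """Pool the slope of y ~ x over m imputed datasets using Rubin's rules."""
    rng = random.Random(seed)
    fits = [ols(impute_once(x, y, rng), y) for _ in range(m)]
    slopes = [fit[1] for fit in fits]
    within = mean(fit[3] for fit in fits)            # average sampling variance
    between = variance(slopes) if m > 1 else 0.0     # variance across imputations
    total_variance = within + (1 + 1 / m) * between
    return mean(slopes), total_variance ** 0.5

# Invented data: a fee variable with two missing values and an attendance outcome.
exhibitor_fee = [1.0, 2.0, None, 4.0, 5.0, None, 7.0, 8.0]
attendance = [2.2, 3.9, 6.1, 8.2, 9.8, 12.1, 14.0, 16.1]
estimate, std_error = pooled_slope(exhibitor_fee, attendance)
print(round(estimate, 2), round(std_error, 2))
```

The pooled standard error is what separates this from filling the gaps once: it carries the extra uncertainty introduced by not having observed the missing values.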
In addition to identifying profitable country regions, our analysis backed up common insights held about trade show profitability and pointed out a few counter-intuitive ones. The final policy suggested that higher fees for exhibitors were reasonable, but not for participants. In the end, attendance is key and trade shows with larger community member pools and higher market sizes are also the ones likely to attract the most participants.
Upon delivery, our analysis was included in the research docket and became the focal point of the strategic targeting discussion.
Regression Analysis
As part of our mentoring initiative, we took on a project to assist a graduate student with their research into neurological responses exhibited by rats when exposed to brain-altering drugs.
A double-blind, full factorial randomized experiment was conducted to compare the effect of the different drugs.
In the end, the student successfully defended their master’s thesis using our analysis.
Non-Parametric Distributional Analysis
A mobile development start-up company was in need of a game economy in which users could buy, sell and exchange in-game tokens for enhancements, additional playable characters, or power-ups.
Leveraging our team’s deep expertise in mathematics, statistics and economics, we derived a skill based experience function that controlled user progression through the game. An additional layer of mathematics was used to derive a set of prices for the in-game items.
Beyond the creation of the game’s economy, we created an integrated parameterized spreadsheet that allowed our client to adjust individual aspects of the pricing environment. This received substantial praise from the client and allowed them to continue iterating on their economy after the official engagement ended.
Econometrics, Curve Fitting, Microeconomics
As part of Iowa State University’s big data initiative, Omni Analytics has been brought on as a partner to provide supplemental research support for the Center for Statistics and Applications in Forensic Evidence (CSAFE).
As part of our contract, we’ve been tasked with developing statistical quality measures to assess the conditions of cross sectional bullet scans.
As an on-going project, we intend to derive, implement, and validate these statistical measures against the current database of image scans.
Advanced R Programming, High Performance Computing
Online there are more dating sites than there are people to populate them, but one Australian start-up had a different take on it. Instead of mindlessly swiping, what about hosting in-person parties where invites come from those in your close network? With this premise, Omni Analytics Group was contacted to help develop a mathematical model to estimate the critical mass required to simultaneously reach viral status and profitability.
OAG discussed the business model with the founder in great detail to derive a set of modeling assumptions and parameters that could influence monthly growth. With these numbers in tow, we set up a simulation that could estimate the long-term growth of the user base.
A grid search of parameters led us to a breakeven point that would produce a sustainable business without requiring large numbers of initial participants or hefty attendee fees.
After signing off on the growth model, the start-up founder was able to use it to secure VC funding.
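A grid search over a growth simulation of this kind might look like the sketch below. It is an illustration only: the referral growth rule, every parameter name and every number are assumptions chosen for the example, not the assumptions agreed with the founder.

```python
from itertools import product

def simulate_users(initial_users, invites_per_user, acceptance_rate, churn_rate, months=36):
    """Month-by-month user counts under a simple invite-driven growth rule (assumed)."""
    users = float(initial_users)
    path = [users]
    for _ in range(months):
        joined = users * invites_per_user * acceptance_rate
        users = users * (1.0 - churn_rate) + joined
        path.append(users)
    return path

def breakeven_month(path, fee, attendance_rate, fixed_cost):
    """First month in which attendee fees cover the fixed monthly cost, else None."""
    for month, users in enumerate(path):
        if users * attendance_rate * fee >= fixed_cost:
            return month
    return None

def grid_search(initial_grid, fee_grid, invites_per_user, acceptance_rate,
                churn_rate, attendance_rate, fixed_cost):
    """Map every (initial_users, fee) pair to its breakeven month."""
    results = {}
    for initial_users, fee in product(initial_grid, fee_grid):
        path = simulate_users(initial_users, invites_per_user, acceptance_rate, churn_rate)
        results[(initial_users, fee)] = breakeven_month(path, fee, attendance_rate, fixed_cost)
    return results

# Hypothetical parameter values, for illustration only.
results = grid_search(
    initial_grid=[50, 100, 200, 400],
    fee_grid=[5, 10, 20],
    invites_per_user=2.0, acceptance_rate=0.15, churn_rate=0.10,
    attendance_rate=0.25, fixed_cost=3000.0,
)
for (initial_users, fee), month in sorted(results.items()):
    print(initial_users, fee, month)
```

Scanning the printed grid for the smallest starting user base and fee that still reach breakeven within an acceptable number of months mirrors the trade-off described above.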
Statistical Curve Fitting
An upstart consulting firm needed assistance with performing a complex customer segmentation for their largest client, a beer manufacturer, looking to understand the types of customers consuming their product.
The data consisted of an online survey where beer drinkers discussed their preferences and consumption scenarios.
Through careful observation, Omni Analytics Group was able to identify and exploit the natural indices found within the survey questions. With these partitions, an alternative index was created to serve as the inputs for a PCA based scoring clustering search.
The hybrid approach worked wonderfully, successfully identifying seven distinct consumer groups, each with a clear narrative well suited for marketing.
Our work formed the foundation of the analysis project, securing a follow-up contract for the upstart consulting firm.
K-Means Clustering, Self Organizing Maps, Principal Component Analysis, Data Visualization
A San Francisco based human resources start-up needed data science expertise for the development of their job personality archetype matching system.
As their data science mercenary team, Omni Analytics Group designed their psychometric matching framework from the ground up.
Under our lead, their on-board industrial psychologist constructed the survey and initialized the Bayesian-esque priors while we developed and coded the archetype matching algorithm. The end result was a scalable recommender system which, after achieving over 75% accuracy in live alpha testing, was later deployed and has now matched over 30,000 beta users.
Specialized R&D, Distance based Similarity Analysis, Data Visualization, Algorithm Design
Having operated as a non-profit for 10 years in the healthcare space, our client wanted to expand its offerings of health informatic related services. They were particularly interested in completing one of their long time initiatives, the creation of a statistical modeling procedure and algorithm that would estimate the cost of care after accounting for severity and risk adjustments. Not only would these cost estimates be used to inform healthcare individuals on expected costs for individual procedures, but they would also serve as a provider accountability tool.
Omni Analytics was handed cryptographically hashed claims data for over 49 different medical ailments. This data, stored on the cloud across multiple databases, was restructured for easier analysis.
Working alongside the client, we established a large workflow diagram outlining the contingencies to account for. These included modifications for the risk and severity adjustments, minimum case requirements, and limits on the amount of missingness in specific columns. This workflow was then implemented in R utilizing the tidyverse package set.
Before the final close-out, our client was utilizing a beta version of the algorithm to produce reports for dissemination. At close-out, the client expressed gratitude for quickly developing such a high-quality, robust procedure.
Regression Analysis, Advanced R Programming, R Package Development, High Performance Computing
Forecasting product demand is difficult, especially when managing over 242,000 SKUs. To assist in this task, Omni Analytics was brought in to perform time series analysis on an e-commerce site’s sales data in hopes of ultimately reducing stock-outs and over accumulation of inventory.
Inside the client’s data storage systems resided five years of monthly sales data for all 242,000 products, some of which had not been on the market for even 6 months.
To handle this big data problem, Omni Analytics developed a customized, ensembled forecasting procedure that fit ARIMA, Holt-Winters and neural network models to each of the products in parallel. After training, the procedure would assess each model’s predictive accuracy on the hold-out sample, rank the models, and then return final sales forecast bounds.
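The fit, score on a hold-out, rank, then forecast loop can be sketched per product as follows. To stay self-contained, three very simple forecasters (naive, mean and drift) stand in for the ARIMA, Holt-Winters and neural network models named above. The error-based bounds, the function names and the sample series are likewise assumptions made for illustration.

```python
from statistics import mean

def naive_forecast(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon

def mean_forecast(train, horizon):
    """Repeat the historical average."""
    return [mean(train)] * horizon

def drift_forecast(train, horizon):
    """Extend the straight line through the first and last observations."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * (h + 1) for h in range(horizon)]

# Stand-ins for the heavier models used in the project.
MODELS = {"naive": naive_forecast, "mean": mean_forecast, "drift": drift_forecast}

def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def rank_models(series, holdout=3):
    """Score every model on the last `holdout` points; best (lowest error) first."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {name: mae(test, model(train, holdout)) for name, model in MODELS.items()}
    return sorted(scores.items(), key=lambda item: item[1])

def forecast_sku(series, horizon=3, holdout=3):
    """Refit the best-ranked model on the full series and return crude bounds."""
    best_name, best_error = rank_models(series, holdout)[0]
    point = MODELS[best_name](series, horizon)
    # Assumed rule: widen each point forecast by the hold-out error.
    return best_name, [(p - best_error, p + best_error) for p in point]

monthly_sales = [10, 12, 14, 16, 18, 20, 22, 24]  # made-up series for one SKU
print(forecast_sku(monthly_sales))
```

Because `forecast_sku` depends only on one product's series, it can be mapped over all SKUs by a pool of workers, which is where the parallelism comes from. Products with very short histories would need a smaller hold-out or a fallback model.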
On live testing, the algorithm was shown to save the company $10,000 a month, with further room for improvement through model parameter refinement and the insertion of additional inventory data.
Time Series Modeling, Neural Networks, ARIMA, Data Visualization, Parallel Processing
A baby product manufacturer, in order to inspire future innovations, wanted to better understand the attitude of a certain segment of their customer base, females with one or more children. As a member of a larger team, Omni Analytics was delegated as the technical lead to develop a global segmentation model that reflected the attitudes and consumer behavior of mothers.
Over the course of 3 months, the online attitudinal survey was conducted using the Survey Monkey platform targeted at mothers from Brazil, China, United Kingdom, and the United States. The total number of respondents tallied at 6,531. Once extracted from the system, the data was restructured to fit standard row, column conventions for easier insertion into the R statistical programming environment.
Omni Analytics leveraged Self Organizing Maps and K-Means algorithms to identify groups of mothers that share similar characteristics.
The analysis found intuitive groups for targeted marketing, which led to a successful campaign.
K-Means Clustering, Data Strategy, Management Consulting
A leading for-profit managed health care company conducts a yearly engagement survey to better understand employee attitudes and thoughts toward management, company policies, and overall satisfaction.
In three short questions, our client’s survey collects open text feedback from over 10,000 respondents, roughly 25% of the company’s employee base. Omni Analytics Group was given access to this data and tasked with categorizing the text into meaningful, actionable summaries that management could investigate.
With the natural language processing capabilities in R, we leveraged regular expressions, latent Dirichlet allocation and Shiny to clean, model and explore the data.
Our efforts created a meaningful breakdown of the responses into 20 easily interpretable topics. We then circled back with our company contact to get business context for the word clusterings. The topics were validated and then included into a report that was shown directly to the executive team.
Topic Analysis, Natural Language Processing, Text Mining, Latent Dirichlet Allocation
Insurance companies have long wanted to better understand what goes on “behind the wheel” of their drivers. Insight into these driving patterns could help improve pricing, inspire incentive structures, and even create a framework to anticipate claims.
With accelerometer data from a convenience sample, the client engaged Omni Analytics Group to analyze the data for distinct driving patterns that would be used to form the internal narrative around customer driving styles.
Using well established clustering and visualization techniques, our team performed standard K-Means clustering and supplemented that algorithm with a Self Organizing Map. The resulting cluster solutions were analyzed visually with parallel coordinate plots, which is a common technique for accentuating the differences across multiple variables.
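For readers unfamiliar with the K-Means step, a minimal sketch follows. It covers only the clustering itself, not the Self Organizing Map or the parallel coordinate plots. The farthest-first initialization is a choice made here to keep the example deterministic, and the two driving features and their values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equally long tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic start: first point, then repeatedly the point farthest from those chosen."""
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        centroids.append(tuple(max(
            points, key=lambda p: min(squared_distance(p, c) for c in centroids))))
    return centroids

def kmeans(points, k, iterations=100):
    """Alternate between assigning points and recomputing centroids until stable."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Hypothetical per-driver features: (hard brakes per trip, sharp turns per trip).
drivers = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
labels, centroids = kmeans(drivers, k=2)
print(labels)
print(centroids)
```

In practice the features would usually be standardized first, so that no single variable dominates the distance calculation.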
At delivery, our client expressed appreciation for finding easily interpretable profiles, a task their internal group was unable to accomplish.
K-Means Clustering, Self Organizing Maps, Principal Component Analysis, Data Visualization
A start-up in the financial services space wanted to tap into the growing trend of social sentiment based stock trading using live Twitter data.
Utilizing Twitter’s API, Omni Analytics engineers created a lightning fast dictionary based scoring model that classified stock related tweets into six emotional categories: joy, fear, disgust, sadness, surprise and anger.
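A dictionary-based scorer of this kind reduces to a lookup per word, which is what makes it fast. The sketch below shows the general shape only. The tiny lexicon, the tie-breaking rule and the sample tweets are invented for illustration, and it does not call the Twitter API.

```python
from collections import Counter
import re

EMOTIONS = ("joy", "fear", "disgust", "sadness", "surprise", "anger")

# Hypothetical toy lexicon; a real one would hold thousands of entries.
LEXICON = {
    "soaring": "joy", "rally": "joy",
    "panic": "fear", "crash": "fear",
    "fraud": "disgust", "scam": "disgust",
    "loss": "sadness", "bankrupt": "sadness",
    "unexpected": "surprise", "shock": "surprise",
    "furious": "anger", "outrage": "anger",
}

def score_tweet(text):
    """Count lexicon hits per emotion for one tweet."""
    counts = Counter({emotion: 0 for emotion in EMOTIONS})
    for token in re.findall(r"[a-z]+", text.lower()):
        emotion = LEXICON.get(token)
        if emotion is not None:
            counts[emotion] += 1
    return dict(counts)

def classify_tweet(text):
    """Return the emotion with the most hits, or None when no lexicon word appears.

    Ties go to the emotion listed first in EMOTIONS (an arbitrary assumed rule).
    """
    scores = score_tweet(text)
    best = max(EMOTIONS, key=lambda emotion: scores[emotion])
    return best if scores[best] > 0 else None

# Invented tweets; $ACME is a made-up ticker.
print(classify_tweet("Panic selling after the crash, what a shock"))
print(classify_tweet("$ACME rally!"))
print(classify_tweet("Earnings call at 5pm"))
```

Aggregating these labels per ticker over a time window would give the kind of sentiment signal a downstream stock-tip service could consume.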
With our algorithm in tow, they created a website that provided subscribing members stock tips based on our sentiment algorithm and other industry metrics.
Sentiment Analysis, Natural Language Processing, Text Mining
A premier coffee subscription company wanted to leverage artificial intelligence to determine the optimal coffee roast to send to its first time customers.
After mining the customer data for patterns, we built two independent supervised learning models for the identification of features relevant to customer lifetime value.
The intellectual property developed for Atlas Coffee Club was fully integrated into their back-end systems and procedures were put in place to better track the lifetime value of customers given their roast recommendation.
The company’s first contact procedure was streamlined and model estimates suggest a potential 20% or greater reduction in attrition for certain optimized scenarios.
Reinforcement Learning, Exploratory Data Analysis, Prescriptive Analytics