- What is Preddle?
- Preddle Uses
- 1 Intro
- 2 Data Upload
- 3 Workspace Settings
- 4 GLM Construct
- 5 GLM Model Fitting
- 6 GLM Grouping Categ'l
- 7 GLM Actual vs Mod'ld
- 8 GLM Spline Fitting
- 9 GLM Betas
- 10 GLM Factor Comb'ns
- 11 GLM Scoring Code
- 12 GLM Residual Plot
- 13 GLM Backup
- 14 GLM Model Compare
- 15 GLM Interactions
- 16 Machine Learning
- 17 ML Segment Def'ns
- 18 ML Correlations
- 19 ML Scoring
- 20 ML Tree Diagram
- 21 ML Two Way Segm't'n
What is Preddle?
Preddle delivers predictive capabilities (like a crystal ball) into the hands of every man, woman and child on Earth (with an internet connection).
Predictive analytics is for everyone. The democratization of predictive analytics parallels that of the World Wide Web, lagged by about 15 years.
15 years ago the only people who could create a personal homepage were those with some technical expertise and knowledge of HTML coding. Even uploading photos of a family holiday, publishing a personal biographical piece and selling things online required computer knowledge that very few people possessed. Back then web developers and computer scientists were the new 'rock stars' because few people had the mastery or the tools. Apart from a few hobbyists, webpages were rarely made for personal use. Businesses dominated the web because it was expensive and difficult to have a web presence.
Today, Facebook has commoditized personal homepage creation to the point where anyone with an internet connection can set up a profile, upload photos of their latest holiday and keep the world informed on what's important to them. What was previously a specialized, technical and geek-dominated domain has opened up to everyone, with no technical knowledge required.
Predictive Analytics (or data-based forecasting) today is where the web was 15 years ago: analytics is complicated, not well understood, dominated by business rather than people, and highly technical. Today's rock stars are data scientists with PhDs in machine learning, using Predictive Analytics to help businesses sell you more stuff. They can predict who will click, die, buy or try.
But individuals like you and me can now use Preddle to understand the world around us and help us make better choices in our day-to-day lives. If you can use an iPhone or Facebook then you can use Preddle.
Preddle does for prediction what Facebook did for the web - opens it up to everybody:
- Simplifies: You don't need a PhD in statistics, or expensive software and hardware, to run Preddle. A high school student can run predictions from their smartphone while sitting at the bus stop.
- Empowers: If you have a problem that can be solved by data-driven predictions then Preddle can help you. This might be predicting which team will win the grand final, determining who is at risk of gambling addiction or completing a school assignment.
- Focuses on people: Everyone's questions about the world are different. Preddle empowers you, as an individual, to understand your specific problem.
Preddle Uses
Preddle can be used whenever you want to predict or explain an outcome based on historical data. Examples include:
- Predict the outcome of sporting events: If you have historical data on players, ground conditions, weather and other sports statistics, you can use it to model the probability of a win (or the final score) based on what you know about upcoming games (e.g. who is playing, what the weather will be like). You probably know intuitively that some players perform better in the wet or in the cold, that some have a strong home ground advantage, and that others seem to play the same regardless of whether they are home or away. Preddle empowers you to put some science behind this intuition, for example by quantifying how much of an advantage a home team has.
- Social science research: Psychologists and other social science researchers often collect vast swathes of survey data on everything from body image to gambling addiction trying to determine what puts someone most at risk. The relationships between factors can be very complex. Is someone more at risk of addiction because of their sex or because of risk seeking tendencies? Does poor body image lead to social disengagement or does social disengagement lead to poor body image?
Most traditional social science statistical techniques are based on ANOVA, t-test and effect size calculations. These techniques are very old: they were devised in a pre-computer era and designed to be calculated by hand! The traditional analytical techniques of the social sciences are bound by strong assumptions and limitations. They measure one-way effects rather than multivariate effects. They either assume normality or require artificial transformations to approximate normality. They test (one-way) effects but don't produce a mathematical model relating predictors to the outcome variable. Most importantly, they can be difficult to understand because they are formula driven rather than visual. Data science has come a long way since the days of ANOVA and t-tests.
Preddle takes full advantage of humans' capacity to understand charts better than tables of numbers. Preddle provides best-practice data science models that are easy and intuitive to understand and provide greater explanatory power, while being more statistically rigorous than the traditional ANOVA and t-test techniques common in social science academia.
This tutorial series focuses on business data (e.g. insurance) but a series of social science (e.g. psychology) tutorials is also available.
- Kaggle entries: If you're entering Kaggle competitions, the advantages of Preddle need no introduction. Skip straight to the video tutorials and jump into the detail.
- Quantified Self: The Quantified Self movement has generated reams of data from individuals logging steps taken, hours slept, calories consumed etc. The effect of daily habits on health, alertness and productivity can be extremely complex. Collecting the data is only the beginning. Use Preddle to analyze it with the best techniques known to data science.
- Business and marketing: Predictive analytics has long been used in business and marketing to understand customer behavior. But the software platforms required were expensive, had steep learning curves and often weren't visual and intuitive. Try Preddle. Watch the video tutorials on modeling insurance customer loss cost to see how Preddle can help with your predictive modeling.
- Stock market prediction: Historical price movements, announcements, financial ratios, dividend yields, economic data and qualitative information like sector categories and analyst ratings can be used as predictors of future returns. Interactions can be used to model whether some predictors are important only in the presence of others. For example, are international expansion announcements value-adding for some industries and value-destroying for others? Model complexity is limited only by your imagination. The two-way data mining feature can be used to model a risk-return tradeoff frontier.
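To make the home ground advantage idea from the sporting example above a little more concrete, here is a minimal sketch in plain Python. It is not Preddle's implementation: the match records are invented for illustration, and the names `games`, `fit_logistic` and `win_probability` are hypothetical. It fits a one-predictor logistic model (a simple kind of GLM) by gradient ascent and reports the home ground effect as an odds ratio.

```python
import math

# Hypothetical match records: (played_at_home, won). Invented for illustration only.
games = [(1, 1)] * 26 + [(1, 0)] * 14 + [(0, 1)] * 16 + [(0, 0)] * 24


def win_probability(b0, b1, home):
    """Modeled probability of a win: sigmoid(b0 + b1 * home)."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * home)))


def fit_logistic(records, steps=2000, learning_rate=1.0):
    """Fit b0 (intercept) and b1 (home effect) by gradient ascent
    on the mean log-likelihood of the logistic model."""
    b0 = b1 = 0.0
    n = len(records)
    for _ in range(steps):
        g0 = g1 = 0.0
        for home, won in records:
            error = won - win_probability(b0, b1, home)
            g0 += error          # gradient with respect to the intercept
            g1 += error * home   # gradient with respect to the home effect
        b0 += learning_rate * g0 / n
        b1 += learning_rate * g1 / n
    return b0, b1


b0, b1 = fit_logistic(games)
home_odds_ratio = math.exp(b1)  # how many times higher the odds of winning are at home

print(f"P(win | home) = {win_probability(b0, b1, 1):.2f}")
print(f"P(win | away) = {win_probability(b0, b1, 0):.2f}")
print(f"Home ground odds ratio = {home_odds_ratio:.2f}")
```

With a single yes/no predictor the fitted probabilities simply reproduce the observed home and away win rates. The value of a model shows up once more predictors (weather, opponent, ground conditions) and interactions between them are added, which is the kind of work a modeling tool takes off your hands.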
If you have other suggestions for how Predictive Analytics can be used for your unique problems please email dion@preddle.com. We can customize tools for your purpose and create 'use case' videos illustrating how to make the best predictions.
Insurance Tutorial 1: Intro
The synthetic data used in this tutorial series can be downloaded here
Insurance Tutorial 2: Data Upload
Insurance Tutorial 3: Workspace Settings
Insurance Tutorial 4: GLM Construct
Insurance Tutorial 5: GLM Model Fitting
Insurance Tutorial 6: GLM Grouping Categorical
Insurance Tutorial 7: GLM Actual vs Modeled
Insurance Tutorial 8: GLM Spline Fitting
Insurance Tutorial 9: GLM Betas
Insurance Tutorial 10: GLM Factor Combinations
Insurance Tutorial 11: GLM Scoring Code
Insurance Tutorial 12: GLM Residual Plot
Insurance Tutorial 13: GLM Backup
Insurance Tutorial 14: GLM Model Compare
Insurance Tutorial 15: GLM Interactions
Insurance Tutorial 16: Machine Learning
Insurance Tutorial 17: ML Segment Definitions
Insurance Tutorial 18: ML Correlations
Insurance Tutorial 19: ML Scoring
Insurance Tutorial 20: ML Tree Diagram
Insurance Tutorial 21: ML Two Way Segmentation