Preddle

About Preddle Social Science Tutorials

What is Preddle?


Preddle delivers predictive capabilities (like a crystal ball) into the hands of every man, woman and child on Earth (with an internet connection).

Predictive analytics is for everyone. The democratization of predictive analytics parallels that of the World Wide Web, lagged by about 15 years.

15 years ago the only people who could create a personal homepage were those with some technical expertise and knowledge of HTML coding. Even uploading photos of a family holiday, publishing a personal biographical piece or selling things online required computer knowledge that very few people possessed. Back then web developers and computer scientists were the new 'rock stars' because few people had the mastery or the tools. Apart from a few hobbyists, webpages were rarely made for personal use. Businesses dominated the web because it was expensive and difficult to have a web presence.

Today, Facebook has commoditized personal home page creation to the point where anyone with an internet connection can set up a profile, upload photos of their latest holiday and keep the world informed on what's important to them. What was previously a specialized, technical and geek-dominated domain has opened up to everyone, with no technical knowledge required.

Predictive Analytics (or data-based forecasting) today is where the web was 15 years ago: complicated, not well understood, highly technical and dominated by businesses rather than people. Today's rock stars are data scientists with PhDs in machine learning, using Predictive Analytics to help businesses sell you more stuff. They can predict who will click, die, buy or try.

But individuals like you and me can now use Preddle to understand the world around us and help us make better choices in our day to day lives. If you can use an iPhone or Facebook then you can use Preddle.

Preddle does for prediction what Facebook did for the web - opens it up to everybody:
- Simplifies: You don't need a PhD in statistics, or expensive software and hardware, to run Preddle. A high school student can run predictions from their smartphone while sitting at the bus stop.
- Empowers: If you have a problem that can be solved by data-driven predictions then Preddle can help you. This might be predicting which team will win the grand final, determining who is at risk of gambling addiction or completing a school assignment.
- Focuses on people: Everyone's questions about the world are different. Preddle empowers you, as an individual, to understand your specific problem.

Preddle Uses


Preddle can be used whenever you want to predict or explain an outcome based on historical data. Examples include:

- Predict the outcome of sporting events: If you have historical data on players, ground conditions, weather and other sports statistics, you can use past results to model the probability of a win (or the final score) based on what you know about upcoming games (e.g. who is playing and what the weather will be like). You probably know intuitively that some players perform better in the wet or in the cold, that some have a strong home ground advantage and that others seem to play the same whether they are home or away. Preddle empowers you to put some science behind this intuition, for example by quantifying how much of an advantage a home team has.

- Social science research: Psychologists and other social science researchers often collect vast swathes of survey data on everything from body image to gambling addiction trying to determine what puts someone most at risk. The relationships between factors can be very complex. Is someone more at risk of addiction because of their sex or because of risk seeking tendencies? Does poor body image lead to social disengagement or does social disengagement lead to poor body image?
   Most traditional social science statistical techniques are based on ANOVA, t-test and effect size calculations. These techniques are very old, devised in a pre-computer era and were designed to be calculated by hand! The traditional analytical techniques of the social sciences are bound by strong assumptions and limitations. They measure one way effects rather than multivariate effects. They either assume normality or require artificial transformations to approximate normality. They test (one way) effects but don't produce a mathematical model relating predictors to the outcome variable. Most importantly, they can be difficult to understand because they are formula driven rather than visual. Data science has come a long way since the days of ANOVA and t-tests.
   Preddle takes full advantage of humans' capacity to understand charts better than tables of numbers. Preddle provides best practice data science models that are easy and intuitive to understand and provide greater explanatory power, while being more statistically rigorous than the traditional ANOVA and t-test techniques common in social science academia.
   There's a plethora of social science data on the web for download and modeling. One great resource is http://www.icpsr.umich.edu/icpsrweb/SAMHDA/download

- Kaggle entries: If you're entering Kaggle competitions, the advantages of Preddle need no introduction. Skip straight to the video tutorials and jump into the detail.

- Quantified Self: The Quantified Self movement has generated reams of data from individuals logging steps taken, hours slept, calories consumed etc. The effect of daily habits on health, alertness and productivity can be extremely complex. Collecting the data is only the beginning. Use Preddle to analyze it with the best techniques known to data science.

- Business and marketing: Predictive analytics has long been used in business and marketing to understand customer behavior. But the software platforms required were expensive, had steep learning curves and often weren't visual and intuitive. Try Preddle. Watch the video tutorials on modeling insurance customer loss cost to see how Preddle can help with your predictive modeling.
   This tutorial series focuses on social science data (e.g. psychology) but a series of business data (e.g. insurance) tutorials is also available.

- Stock market prediction: Historical price movements, announcements, financial ratios, dividend yields, economic data and qualitative information like sector categories and analyst ratings can be used as predictors of future returns. Interactions can be used to model whether some predictors are important only in the presence of others. For example, are international expansion announcements value adding for some industries and value destroying for others? Model complexity is limited only by your imagination. The two way data mining feature can be used to model a risk-return tradeoff frontier.
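
To make two of the ideas above concrete (quantifying a home ground advantage, and an interaction where one predictor matters more in the presence of another), here is a minimal Python sketch. It is not Preddle's implementation or output: the game records are made-up toy numbers, and the names games, cell and win_probability are hypothetical. It assumes a logistic model with two yes/no predictors plus their interaction, which can be solved exactly from the win/loss counts in each of the four groups.

```python
import math

# Made-up toy data, for illustration only: (played_at_home, wet_weather, won)
games = (
    [(1, 0, 1)] * 30 + [(1, 0, 0)] * 10 +  # home, dry: 30 wins, 10 losses
    [(0, 0, 1)] * 20 + [(0, 0, 0)] * 20 +  # away, dry: 20 wins, 20 losses
    [(1, 1, 1)] * 12 + [(1, 1, 0)] * 8 +   # home, wet: 12 wins, 8 losses
    [(0, 1, 1)] * 5 + [(0, 1, 0)] * 15     # away, wet: 5 wins, 15 losses
)

def cell(home, wet):
    """All games played under one combination of the two predictors."""
    return [g for g in games if g[0] == home and g[1] == wet]

def log_odds(records):
    """log(wins / losses); assumes the group has at least one win and one loss."""
    wins = sum(won for _, _, won in records)
    losses = len(records) - wins
    return math.log(wins / losses)

# Logistic model with an interaction term:
#   logit(P(win)) = b0 + b_home*home + b_wet*wet + b_int*home*wet
# With two binary predictors and their interaction, the model has one
# parameter per group, so the coefficients follow directly from group log-odds.
b0 = log_odds(cell(0, 0))
b_home = log_odds(cell(1, 0)) - b0
b_wet = log_odds(cell(0, 1)) - b0
b_int = log_odds(cell(1, 1)) - b0 - b_home - b_wet

def win_probability(home, wet):
    """Modeled probability of a win for a given home/wet combination."""
    z = b0 + b_home * home + b_wet * wet + b_int * home * wet
    return 1.0 / (1.0 + math.exp(-z))

# How much playing at home multiplies the odds of winning, in dry and wet games
home_odds_ratio_dry = math.exp(b_home)
home_odds_ratio_wet = math.exp(b_home + b_int)

print(f"Home odds ratio (dry): {home_odds_ratio_dry:.2f}")
print(f"Home odds ratio (wet): {home_odds_ratio_wet:.2f}")
print(f"P(win | home, wet):    {win_probability(1, 1):.2f}")
```

In this toy data the home advantage is larger in wet games than in dry ones, which is the kind of effect an interaction term captures. Real data with more predictors, or with continuous ones, would need an iterative model fit rather than this closed-form shortcut.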

If you have other suggestions for how Predictive Analytics can be used for your unique problems please email dion@preddle.com. We can customize tools for your purpose and create 'use case' videos illustrating how to make the best predictions.

A business data tutorial series is also available.


Social Science Tutorial 1: Intro



The raw data used in this tutorial series can be downloaded here

The recoded and mapped data used in this tutorial series can be downloaded here

Social Science Tutorial 2: Social Science Research



Social Science Tutorial 3: Objectives



Social Science Tutorial 4: Data Preparation



Social Science Tutorial 5: Data Upload



Social Science Tutorial 6: Workspace Settings



Social Science Tutorial 7: Predictor Settings



Social Science Tutorial 8: GLM Model Construct



Social Science Tutorial 9: GLM Low Sample Size



Social Science Tutorial 10: GLM Fitting Models



Social Science Tutorial 11: GLM Mediating Variables



Social Science Tutorial 12: GLM Confidence Intervals



Social Science Tutorial 13: GLM Grouping Categ'l Vars



Social Science Tutorial 14: GLM Actual Vs Modeled



Social Science Tutorial 15: GLM Spline Fitting



Social Science Tutorial 16: GLM Betas



Social Science Tutorial 17: GLM Interpretation



Social Science Tutorial 18: GLM Factor Combinat'ns



Social Science Tutorial 19: GLM Scoring Code



Social Science Tutorial 20: GLM Residuals



Social Science Tutorial 21: GLM Model Compare



Social Science Tutorial 22: GLM Interactions



Social Science Tutorial 23: Data Mine Intro



Social Science Tutorial 24: DM Definitions



Social Science Tutorial 25: DM Correlations



Social Science Tutorial 26: DM Scoring Code



Social Science Tutorial 27: DM Tree



Social Science Tutorial 28: DM Two Way Segmentation


