Skip to content

JaimeVRodriguez/sof_selection_analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

67 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Special Operations Forces Assessment and Selection

TO BE OR NOT TO BE SELECTED



Table of Contents





Introduction


This data is comprised of 12 different excel files spanning 6 years of selection and assesment results for the Special Operations (SOF) fields of Special Forces (SF) and Psychological Operations (PSYOP). The data is aggreted through a series of people from the application and selection process for these SOF fields.




Data Processing


Due to the aggregation process of this data, there were several human errors and difference in much of the data. Several steps where needed in the data cleaning process.

  • 19 columns were removed
    • Personal Data: Personal Identifiable Information such as name, DOB, SSN's, etc were removed in order to not identify individuals as well as comply with Special Operations personal identity policies.
    • Additional columns were deemd unneeded for the purposes of this analyis. These columns were either additional non-identifiable personal information or information that related specifically to selection and not to the individuals attending selection
  • Missing Information
    • Records not containing all information pertained to an individual were removed in order to non alter or skew the data
    • 16 POAS Records removed
    • 231 SFAS Records removed

Overview


Key Highlights

  • ~2635 individuals attended POAS
    • 748 selected: 28.3% Select Rate
  • ~8116 individuals attended SFAS
    • 1363 selected: 16.8% Select Rate




Age

Key Highlights

  • POAS
    • Average age of 28.5
  • SFAS
    • Average age of 29




Key Features

Key Highlights

  • POAS

    • PT Median score of 271
    • GT Median score of 117
    • CO Median score of 117
  • SFAS

    • PT Median score of 285
    • GT Median score of 117
    • CO Median score of 117
  • POAS CO Q3 matched that of SFAS at 124





Correlation






Hypothesis


$H_O$: GT Score does not determine whether someone is selected
$H_A$: GT Score determines whether someone is selected

$H_O$: PT Score does not determine whether someone is selected
$H_A$: PT Score determines whether someone is selected

$H_O$: Age does not determine whether someone is selected
$H_A$: Age determines whether someone is selected

$H_O$: Exisiting Language skill does not determine whether someone is selected
$H_A$: Exisiting Language skill determines whether someone is selected


T-Test

T-Test conducted on features based on whether or not an individual was selected.

The following are the p-values with their corresponding feature:
GT: 2.5269360674271627e-40

  • Reject $H_O$

PT: 7.822031090768901e-09

  • Reject $H_O$

AGE: 4.113012469721914e-06

  • Reject $H_O$

LANG: 0.6882465259249742

  • Cannot reject $H_O$

Distribution





Logistic Model


POAS

Significant Features

  • Whether someone is AIRBORNE qualified
  • Someones AGE
  • GT Score
  • SC Score
  • Whether someone needed a Waiver

Single Logistic Regression Model PSYOP

Logistic Regression with KFold


SFAS

Significant Features

  • Someones RANK
  • Whether someone is AIRBORNE qualified
  • Whether someone is RANGER qualified
  • The amount of DEPENDENTS someone has
  • Someones AGE
  • ST Score
  • Whether someone needed a WAIVER

Single Logistic Regression Model PSYOP

Logistic Regression with KFold



Significant Features

Model with all features only performed slightly better than chance. Meaning there is no better result than a random guess. There could be a couple of different factors. Lack of data, only 10,000 records is not a sufficiant amount in order to properly train a model. Secondly, there are additional factors that should be included or considered when determining the selection results of an individual. These features include things like team work, leadership, communication, etc.

However, when narrowing down the model to only existing features that are significant, the model does perform better. However, one must take into consideration what they want out of the model. A predictive model is not always the "best" model.

About

Exploratory and Statistical Analysis on PSYOP Assessment and Selection

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors