This project uses statistical analysis to examine which factors influence tourism contribution in a provided tourism dataset. The main objective is to determine whether selected tourism indicators affect tourism contribution and to understand how these indicators relate to each other.
The analysis was conducted as part of a final project on Multiple Linear Regression, using SPSS as the primary statistical tool.
The purpose of this study is to:
- Identify key factors that influence tourism contribution
- Examine relationships among tourism-related variables
- Apply multiple linear regression to assess the impact of selected independent variables on tourism contribution
The dataset contains tourism-related indicators across multiple observations. Key variables include:
- Tourism contribution percentage
- Inbound tourist arrivals
- Outbound tourist departures
- Employment in the tourism sector
- Other related tourism indicators

The dataset initially contained missing values, which were addressed during the data preparation stage:
- The dataset was first examined in Excel to understand its structure and identify missing values.
- Data cleaning was performed in SPSS.
- Missing values in selected variables were treated using the mean substitution method.
- A dummy variable (high_inbound) was created to represent countries with high inbound tourism levels (1 = high, 0 = otherwise).
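The two preparation steps above (mean substitution and dummy coding) were done in SPSS. As a rough illustration of what they do, here is a minimal Python sketch that uses only the standard library. The helper names `mean_substitute` and `make_dummy`, the sample numbers, and the cutoff used for "high" are all assumptions made for the example. The project does not state which threshold defined `high_inbound`.

```python
from statistics import mean

def mean_substitute(values):
    """Replace None entries with the mean of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def make_dummy(values, threshold):
    """Return 1 where the value is strictly above the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]

# Made-up numbers for illustration only; not taken from the project dataset.
inbound_arrivals = [100.0, None, 50.0, 300.0]

filled = mean_substitute(inbound_arrivals)

# Assumed cutoff: the column mean after imputation (hypothetical rule).
high_inbound = make_dummy(filled, threshold=mean(filled))

print(filled)
print(high_inbound)
```

Note that an imputed value sits exactly at the column mean, so under this assumed cutoff an imputed observation is never coded as high.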
Dependent Variable
- tourism_percent – Measures tourism contribution
Independent Variables
- inbound_arrivals
- outbound_departure
- employment
- high_inbound (Dummy variable: 0/1)
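Written out, the variables listed above correspond to a regression model of the following form. The coefficient symbols and the observation index i are notation introduced here for illustration; the project itself does not give an equation.

```latex
\[
\text{tourism\_percent}_i = \beta_0
  + \beta_1\,\text{inbound\_arrivals}_i
  + \beta_2\,\text{outbound\_departure}_i
  + \beta_3\,\text{employment}_i
  + \beta_4\,\text{high\_inbound}_i
  + \varepsilon_i
\]
```

Here \(\beta_4\) is the expected difference in tourism contribution between high-inbound observations (1) and all others (0), holding the remaining variables fixed, and \(\varepsilon_i\) is the error term.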
The following statistical techniques were applied using SPSS:
- Descriptive Statistics
  - Mean
  - Median
  - Standard Deviation
  - Minimum
  - Maximum
- Correlation Analysis
  - Pearson correlation coefficients were used to examine relationships among variables.
- Multiple Linear Regression
  - Used to assess the impact of independent variables on tourism contribution.
  - Model output (R-square, the ANOVA table, and coefficient estimates) and multicollinearity checks were examined.
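The descriptive statistics and the Pearson correlation listed above can be sketched outside SPSS as follows. This is a minimal illustration that uses only Python's standard library. The helper names `describe` and `pearson_r` and the sample values are assumptions made for the example, not part of the project.

```python
import math
import statistics

def describe(values):
    """Return the five summary statistics listed above for one variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),  # sample SD (n - 1 denominator)
        "min": min(values),
        "max": max(values),
    }

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

# Made-up values for illustration only; not taken from the project dataset.
employment = [2.0, 4.0, 6.0, 8.0]
tourism_percent = [1.0, 3.0, 2.0, 4.0]

summary = describe(employment)
r = pearson_r(employment, tourism_percent)

print(summary)
print(round(r, 3))
```

A coefficient near 0 indicates a weak linear relationship, and values near -1 or 1 indicate a strong one.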
Key Findings
- Descriptive statistics showed wide variation across tourism indicators.
- Correlation analysis revealed mostly weak relationships among variables.
- The regression model explained a small proportion of the variation in tourism contribution.
- None of the independent variables showed a statistically significant impact on tourism contribution at the 5% significance level.
- The results suggest that other factors not included in the dataset may play a stronger role in influencing tourism contribution.
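The regression findings above rest on two quantities from the fitted model: R-square (the share of variation explained) and the t-statistic of each coefficient. The sketch below shows how both come out of an ordinary least-squares fit. It uses only the Python standard library. The function names `invert_matrix` and `ols` and the five-row dataset are assumptions made for illustration. It is not the SPSS procedure used in the project and does not reproduce its results.

```python
import math

def invert_matrix(a):
    """Invert a square matrix with Gauss-Jordan elimination and partial pivoting."""
    n = len(a)
    # Augment a copy of the matrix with the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix (perfectly collinear predictors?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ols(rows, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk by ordinary least squares.

    Returns (coefficients, r_squared, t_statistics), intercept first.
    """
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    n, k = len(X), len(X[0])
    if n <= k:
        raise ValueError("need more observations than coefficients")
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert_matrix(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    if ss_tot == 0:
        raise ValueError("R-square is undefined for a constant outcome")
    r_squared = 1.0 - ss_res / ss_tot
    sigma2 = ss_res / (n - k)  # residual variance estimate
    # Standard error of each coefficient: sqrt(sigma2 * diagonal of (X'X)^-1).
    t_stats = [b / math.sqrt(sigma2 * inv[j][j]) for j, b in enumerate(beta)]
    return beta, r_squared, t_stats

# Made-up data for illustration only: two generic predictors, five observations.
predictors = [(-2.0, 2.0), (-1.0, -1.0), (0.0, -2.0), (1.0, -1.0), (2.0, 2.0)]
outcome = [3.0, 1.0, 2.0, 4.0, 5.0]

beta, r_squared, t_stats = ols(predictors, outcome)
print([round(b, 3) for b in beta])
print(round(r_squared, 3))
print([round(t, 3) for t in t_stats])
```

A coefficient is judged significant at the 5% level when the absolute value of its t-statistic exceeds the critical value of the t distribution with n - k degrees of freedom. SPSS reports the equivalent p-value in the "Sig." column of its coefficients table. In this toy example only 2 degrees of freedom remain, so the two-sided 5% critical value is about 4.30.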
Tools Used
- SPSS – Data cleaning and statistical analysis
- Microsoft Excel – Initial data inspection
This project demonstrates the application of descriptive statistics, correlation analysis, and multiple linear regression using SPSS. Although the regression results were not statistically significant, the study provides a clear example of how statistical methods can be used to analyze tourism-related data and interpret results objectively.