Nazrilravipratama1609

Nazril Ravi Pratama

Recently Published

Comparative Clustering Analysis of Indonesian Provincial Socioeconomic Indicators
This document presents the R code and output for a comparative clustering analysis of 34 Indonesian provinces using K-Means, K-Medoids (PAM), DBSCAN, Mean Shift, and Fuzzy C-Means methods based on 16 socioeconomic indicators.
Comparative Analysis of PCA and Factor Analysis on Provincial Socioeconomic Indicators in Indonesia (2021–2024)
This study compares Principal Component Analysis and Factor Analysis on Indonesian provincial socioeconomic data (2021–2024). Parallel Analysis is used as the primary retention criterion. The findings highlight differences between variance-based and latent-factor approaches in explaining regional development patterns.
Titanic Task Data Science
This project is part of the INT24 assignment, which aims to develop foundational skills in using R and RPubs for data analysis and statistical exploration. The dataset used in this task is the Titanic Dataset obtained from Kaggle (https://www.kaggle.com/datasets/yasserh/titanic-dataset?select=Titanic-Dataset.csv). The analysis begins by importing the dataset into R and exploring its structure and variables. From the available features, four numerical variables were selected for further analysis: Age, SibSp, Parch, and Fare. Rows with missing values in any of these variables were removed so that the statistics are computed on complete cases only. Several statistical analyses were then performed in R. First, a correlation matrix was generated to examine the strength and direction of the relationships among the selected variables. Next, a variance–covariance matrix was computed to measure how the variables vary together. Eigenvalues and eigenvectors were then calculated from the covariance matrix to identify the principal directions of variance in the data, which provides a foundation for understanding dimensionality reduction techniques such as Principal Component Analysis (PCA). Each output is interpreted to explain the relationships between variables, the scale of variability, and the contribution of each component to the overall variance of the dataset. In this task, R serves not only as a computational tool but also as a medium for reproducible data analysis, while RPubs is used as a platform to publish and share the results in a clear and structured format.
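The steps described for the Titanic task can be sketched in base R roughly as follows. This is a minimal illustration, not the author's published code: the data frame `toy` holds a few made-up rows standing in for the Kaggle file, and the object names (`vars`, `clean`, `cor_mat`, `cov_mat`, `eig`, `var_share`) are assumptions chosen for readability.

```r
# Illustrative sketch only: `toy` is a small made-up data frame, NOT the Kaggle data.
# With the real file one would presumably start from something like
# read.csv("Titanic-Dataset.csv") (the local path is an assumption).
toy <- data.frame(
  Age   = c(22, 38, 26, 35, NA, 54, 2, 27),
  SibSp = c(1, 1, 0, 1, 0, 0, 3, 0),
  Parch = c(0, 0, 0, 0, 0, 0, 1, 2),
  Fare  = c(7.25, 71.28, 7.92, 53.10, 8.46, 51.86, 21.08, 11.13)
)

# The four numerical variables named in the description
vars <- c("Age", "SibSp", "Parch", "Fare")

# Keep only rows with no missing values in the selected variables
clean <- na.omit(toy[, vars])

# Pearson correlation matrix: strength and direction of linear relationships
cor_mat <- cor(clean)

# Variance-covariance matrix: joint variability in the original units
cov_mat <- cov(clean)

# Eigen decomposition of the covariance matrix:
# eig$values are the variances along the principal directions,
# eig$vectors holds those directions as columns
eig <- eigen(cov_mat)

# Share of total variance carried by each principal direction
var_share <- eig$values / sum(eig$values)
```

Because the decomposition is taken on the covariance matrix rather than the correlation matrix, variables on larger scales (Fare and Age here) tend to dominate the leading direction; decomposing `cor_mat` instead would put all four variables on an equal footing.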