danicoder

Daniel

Recently Published

NYC 2013 Flights Data Cleaning
The main objective of this cleaning process is to transform the raw, segmented temporal data of the NYC 2013 flights dataset into a continuous and usable timeline. By converting military-style integers into formal time objects and reconciling scheduled times with their respective delays, the script aims to create accurate, high-fidelity datetime columns. A critical component of this objective is the implementation of logical corrections for overnight flights, ensuring that arrivals occurring after midnight are correctly attributed to the following calendar day, thereby maintaining temporal integrity for subsequent analysis.
Advanced Relational Data Manipulation: Joins & Time-Series Analysis in R
This report explores the various levels of join operations using the tidyverse and nycflights13 datasets. It transitions from basic mutating and filtering joins to complex non-equi joins and self-joins. Key highlights include:
- Custom datetime transformation from integer (HMM) formats.
- Advanced join_by() syntax for inequality and rolling window matches.
- Practical applications of anti-joins for data integrity and semi-joins for filtering.
- Self-joins with restricted time windows to analyze consecutive flight departures.
EDA (Exploratory Data Analysis) of Bike Buyers dataset performed with R
Cleaning and analysis of a dataset from Kaggle to practice data skills using R.
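
As a rough illustration of the approach described in the NYC 2013 Flights Data Cleaning entry, the sketch below converts HMM-style integers into datetimes and shifts after-midnight arrivals to the next calendar day. It is not the author's script: the sample rows, the helper name hmm_to_datetime, and the use of UTC (chosen here only to sidestep daylight-saving shifts) are assumptions, and it relies on base R alone.

```r
# Hypothetical sample rows shaped like nycflights13::flights
# (values invented for illustration, not taken from the dataset).
flights <- data.frame(
  year = 2013L, month = 1L, day = c(1L, 1L),
  dep_time = c(517L, 2330L),  # HMM / HHMM integers
  arr_time = c(830L, 105L)    # the second flight lands after midnight
)

# Hypothetical helper: combine a calendar date with an HMM integer
# into a POSIXct datetime. Hours are hmm %/% 100, minutes are hmm %% 100.
hmm_to_datetime <- function(year, month, day, hmm, tz = "UTC") {
  ISOdatetime(year, month, day,
              hour = hmm %/% 100, min = hmm %% 100, sec = 0, tz = tz)
}

flights$dep_dt <- hmm_to_datetime(flights$year, flights$month,
                                  flights$day, flights$dep_time)
flights$arr_dt <- hmm_to_datetime(flights$year, flights$month,
                                  flights$day, flights$arr_time)

# Overnight correction (assumption): an arrival that appears earlier than
# its departure is taken to belong to the following calendar day.
overnight <- !is.na(flights$dep_dt) & !is.na(flights$arr_dt) &
  flights$arr_dt < flights$dep_dt
flights$arr_dt[overnight] <- flights$arr_dt[overnight] + 24 * 60 * 60
```

Comparing the arrival against the departure is only one possible rule for detecting overnight flights; the published report may reconcile scheduled times and delays differently.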