Recently Published

Assigning positions in blocks of repeated elements in a vector: a performance comparison in R
This exercise compares four methods for numbering positions within runs of consecutive, repeated elements in a vector: each run of a target value is labeled 1, 2, 3, and so on, while all other values stay unchanged. For example, given the vector 0 0 0 1 1 1 1 0 0 1 1 0 0 0 1 1 1 1 0 0 0 0 0, the desired output is 0 0 0 1 2 3 4 0 0 1 2 0 0 0 1 2 3 4 0 0 0 0 0. This need arises in time series analysis (identifying trends and patterns in sequential data), genomic sequence processing (assigning positions in repeated nucleotides or amino acid sequences), and text data manipulation (detecting and processing repeated words, phrases, or characters). To cover these use cases, I wrote a generalized function for each of the four methods and tested their efficiency on vectors of varying lengths, scaling up to 1 × 10⁶ elements (Figs. 1 and 2). The key takeaways are:
1. Different approaches yield the same result, with varying trade-offs in efficiency, readability, and flexibility.
2. rle is the best choice when speed is critical.
3. Benchmarking is essential when selecting methods for large-scale data processing.
4. Alternative implementations not covered here may improve performance further.
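As an illustration of the rle-based idea, here is a minimal sketch in base R. It is not the author's generalized function: the name `label_runs` and the `target` argument are hypothetical, and the sketch assumes an atomic vector with no `NA` values.

```r
# Hypothetical helper: number the positions inside each run of `target`,
# leaving every other element unchanged. Assumes `x` contains no NA values.
label_runs <- function(x, target = 1) {
  runs <- rle(x)                         # lengths and values of consecutive runs
  pos <- sequence(runs$lengths)          # 1..n counter restarted for every run
  is_target <- rep(runs$values == target, runs$lengths)
  out <- x
  out[is_target] <- pos[is_target]       # overwrite only the target runs
  out
}

x <- c(0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0)
result <- label_runs(x, target = 1)
print(result)
```

On the example vector from the description, this reproduces the desired output 0 0 0 1 2 3 4 0 0 1 2 0 0 0 1 2 3 4 0 0 0 0 0. A rough timing on a larger input can be taken with `system.time(label_runs(rep(x, 40000)))`; a proper comparison between methods would need repeated runs.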
waffle_ggplot2
STATISTICS PROJECT
María Del Mar Tejada B Juan Pablo Sanchez Laura Duque
Final Project
W4L4 Annotations
Next Word Prediction App
CH4 - Campo
Peace outflows - 2023 to 2025
Outflows from below the dam(s) in 2023, 2024 and 2025 relative to IQR
HTML
Plot_distr
Plot