References

Agrawal, Rakesh, Tomasz Imielinski, and Arun Swami. 1993. “Mining Association Rules Between Sets of Items in Large Databases.” In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, 207–16. Washington, D.C., United States: ACM Press.
Bates, Douglas, Martin Maechler, and Mikael Jagan. 2024. Matrix: Sparse and Dense Matrix Classes and Methods. https://Matrix.R-forge.R-project.org.
Blake, Catherine L., and Christopher J. Merz. 1998. UCI Repository of Machine Learning Databases. Irvine, CA: University of California, Irvine, Department of Information and Computer Sciences.
Breiman, Leo, Adele Cutler, Andy Liaw, and Matthew Wiener. 2024. randomForest: Breiman and Cutler’s Random Forests for Classification and Regression. https://www.stat.berkeley.edu/~breiman/RandomForests/.
Buchta, Christian, and Michael Hahsler. 2024. arulesSequences: Mining Frequent Sequences. https://CRAN.R-project.org/package=arulesSequences.
Carr, Dan, Nicholas Lewin-Koh, and Martin Maechler. 2024. Hexbin: Hexagonal Binning Routines. https://github.com/edzer/hexbin.
Chen, Tianqi, Tong He, Michael Benesty, Vadim Khotilovich, Yuan Tang, Hyunsu Cho, Kailong Chen, et al. 2024. Xgboost: Extreme Gradient Boosting. https://github.com/dmlc/xgboost.
Chen, Ying-Ju, Fadel M. Megahed, L. Allison Jones-Farmer, and Steven E. Rigdon. 2023. Basemodels: Baseline Models for Classification and Regression. https://github.com/Ying-Ju/basemodels.
Fraley, Chris, Adrian E. Raftery, and Luca Scrucca. 2024. Mclust: Gaussian Mixture Modelling for Model-Based Clustering, Classification, and Density Estimation. https://mclust-org.github.io/mclust/.
Friedman, Jerome, Trevor Hastie, Rob Tibshirani, Balasubramanian Narasimhan, Kenneth Tay, Noah Simon, and James Yang. 2023. Glmnet: Lasso and Elastic-Net Regularized Generalized Linear Models. https://glmnet.stanford.edu.
Friedman, Jerome, Robert Tibshirani, and Trevor Hastie. 2010. “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software 33 (1): 1–22. https://doi.org/10.18637/jss.v033.i01.
Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. https://www.jstatsoft.org/v40/i03/.
Hahsler, Michael. 2017a. “An Experimental Comparison of Seriation Methods for One-Mode Two-Way Data.” European Journal of Operational Research 257 (1): 133–43. https://doi.org/10.1016/j.ejor.2016.08.066.
———. 2017b. “arulesViz: Interactive Visualization of Association Rules with R.” R Journal 9 (2): 163–75. https://doi.org/10.32614/RJ-2017-047.
———. 2021. An R Companion for Introduction to Data Mining. Online Book. https://mhahsler.github.io/Introduction_to_Data_Mining_R_Examples/book.
———. 2024. arulesViz: Visualizing Association Rules and Frequent Itemsets. https://github.com/mhahsler/arulesViz.
Hahsler, Michael, Christian Buchta, Bettina Gruen, and Kurt Hornik. 2024. Arules: Mining Association Rules and Frequent Itemsets. https://github.com/mhahsler/arules.
Hahsler, Michael, Christian Buchta, and Kurt Hornik. 2024. Seriation: Infrastructure for Ordering Objects Using Seriation. https://github.com/mhahsler/seriation.
Hahsler, Michael, Sudheer Chelluboina, Kurt Hornik, and Christian Buchta. 2011. “The arules R-Package Ecosystem: Analyzing Interesting Patterns from Large Transaction Datasets.” Journal of Machine Learning Research 12: 1977–81. https://jmlr.csail.mit.edu/papers/v12/hahsler11a.html.
Hahsler, Michael, Bettina Grün, and Kurt Hornik. 2005. “arules – A Computational Environment for Mining Association Rules and Frequent Item Sets.” Journal of Statistical Software 14 (15): 1–25. https://doi.org/10.18637/jss.v014.i15.
Hahsler, Michael, Kurt Hornik, and Christian Buchta. 2008. “Getting Things in Order: An Introduction to the R Package seriation.” Journal of Statistical Software 25 (3): 1–34. https://doi.org/10.18637/jss.v025.i03.
Hahsler, Michael, and Matthew Piekenbrock. 2024. Dbscan: Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Related Algorithms. https://github.com/mhahsler/dbscan.
Hahsler, Michael, Matthew Piekenbrock, and Derek Doran. 2019. “dbscan: Fast Density-Based Clustering with R.” Journal of Statistical Software 91 (1): 1–30. https://doi.org/10.18637/jss.v091.i01.
Hastie, Trevor, and Brad Efron. 2022. Lars: Least Angle Regression, Lasso and Forward Stagewise. https://doi.org/10.1214/009053604000000067.
Hennig, Christian. 2024. Fpc: Flexible Procedures for Clustering. https://www.unibo.it/sitoweb/christian.hennig/en/.
Hornik, Kurt. 2023. RWeka: R/Weka Interface. https://CRAN.R-project.org/package=RWeka.
Hornik, Kurt, Christian Buchta, and Achim Zeileis. 2009. “Open-Source Machine Learning: R Meets Weka.” Computational Statistics 24 (2): 225–32. https://doi.org/10.1007/s00180-008-0119-7.
Horst, Allison, Alison Hill, and Kristen Gorman. 2022. Palmerpenguins: Palmer Archipelago (Antarctica) Penguin Data. https://allisonhorst.github.io/palmerpenguins/.
Hothorn, Torsten, Peter Buehlmann, Sandrine Dudoit, Annette Molinaro, and Mark Van Der Laan. 2006. “Survival Ensembles.” Biostatistics 7 (3): 355–73. https://doi.org/10.1093/biostatistics/kxj011.
Hothorn, Torsten, Kurt Hornik, Carolin Strobl, and Achim Zeileis. 2024. Party: A Laboratory for Recursive Partytioning. http://party.R-forge.R-project.org.
Hothorn, Torsten, Kurt Hornik, and Achim Zeileis. 2006. “Unbiased Recursive Partitioning: A Conditional Inference Framework.” Journal of Computational and Graphical Statistics 15 (3): 651–74. https://doi.org/10.1198/106186006X133933.
Karatzoglou, Alexandros, Alex Smola, and Kurt Hornik. 2024. Kernlab: Kernel-Based Machine Learning Lab. https://CRAN.R-project.org/package=kernlab.
Karatzoglou, Alexandros, Alex Smola, Kurt Hornik, and Achim Zeileis. 2004. “Kernlab – an S4 Package for Kernel Methods in R.” Journal of Statistical Software 11 (9): 1–20. https://doi.org/10.18637/jss.v011.i09.
Kassambara, Alboukadel. 2023. Ggcorrplot: Visualization of a Correlation Matrix Using Ggplot2. http://www.sthda.com/english/wiki/ggcorrplot-visualization-of-a-correlation-matrix-using-ggplot2.
Kassambara, Alboukadel, and Fabian Mundt. 2020. Factoextra: Extract and Visualize the Results of Multivariate Data Analyses. http://www.sthda.com/english/rpkgs/factoextra.
Kuhn, Max. 2023. Caret: Classification and Regression Training. https://github.com/topepo/caret/.
Kuhn, Max, and Ross Quinlan. 2023. C50: C5.0 Decision Trees and Rule-Based Models. https://topepo.github.io/C5.0/.
Kuhn, Max. 2008. “Building Predictive Models in R Using the caret Package.” Journal of Statistical Software 28 (5): 1–26. https://doi.org/10.18637/jss.v028.i05.
Leisch, Friedrich, and Evgenia Dimitriadou. 2024. Mlbench: Machine Learning Benchmark Problems. https://CRAN.R-project.org/package=mlbench.
Liaw, Andy, and Matthew Wiener. 2002. “Classification and Regression by randomForest.” R News 2 (3): 18–22. https://CRAN.R-project.org/doc/Rnews/.
Maechler, Martin, Peter Rousseeuw, Anja Struyf, and Mia Hubert. 2023. Cluster: "Finding Groups in Data": Cluster Analysis Extended Rousseeuw Et Al. https://svn.r-project.org/R-packages/trunk/cluster/.
Meyer, David, and Christian Buchta. 2022. Proxy: Distance and Similarity Measures. https://CRAN.R-project.org/package=proxy.
Meyer, David, Evgenia Dimitriadou, Kurt Hornik, Andreas Weingessel, and Friedrich Leisch. 2024. E1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien. https://CRAN.R-project.org/package=e1071.
Milborrow, Stephen. 2024. Rpart.plot: Plot Rpart Models: An Enhanced Version of Plot.rpart. http://www.milbo.org/rpart-plot/index.html.
Müller, Kirill, and Hadley Wickham. 2023. Tibble: Simple Data Frames. https://tibble.tidyverse.org/.
R Core Team. 2024. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Ripley, Brian. 2023. Nnet: Feed-Forward Neural Networks and Multinomial Log-Linear Models. http://www.stats.ox.ac.uk/pub/MASS4/.
Ripley, Brian, and Bill Venables. 2024. MASS: Support Functions and Datasets for Venables and Ripley’s MASS. http://www.stats.ox.ac.uk/pub/MASS4/.
Robin, Xavier, Natacha Turck, Alexandre Hainard, Natalia Tiberti, Frédérique Lisacek, Jean-Charles Sanchez, and Markus Müller. 2011. “pROC: An Open-Source Package for R and S+ to Analyze and Compare ROC Curves.” BMC Bioinformatics 12: 77.
———. 2023. pROC: Display and Analyze ROC Curves. https://xrobin.github.io/pROC/.
Roever, Christian, Nils Raabe, Karsten Luebke, Uwe Ligges, Gero Szepannek, Marc Zentgraf, and David Meyer. 2023. klaR: Classification and Visualization. https://statistik.tu-dortmund.de.
Romanski, Piotr, Lars Kotthoff, and Patrick Schratz. 2023. FSelector: Selecting Attributes. https://github.com/larskotthoff/fselector.
Sarkar, Deepayan. 2008. Lattice: Multivariate Data Visualization with R. New York: Springer. http://lmdvr.r-forge.r-project.org.
———. 2023. Lattice: Trellis Graphics for R. https://lattice.r-forge.r-project.org/.
Schloerke, Barret, Di Cook, Joseph Larmarange, Francois Briatte, Moritz Marbach, Edwin Thoen, Amos Elberg, and Jason Crowley. 2024. GGally: Extension to Ggplot2. https://ggobi.github.io/ggally/.
Scrucca, Luca, Chris Fraley, T. Brendan Murphy, and Adrian E. Raftery. 2023. Model-Based Clustering, Classification, and Density Estimation Using mclust in R. Chapman and Hall/CRC. https://doi.org/10.1201/9781003277965.
Sievert, Carson. 2020. Interactive Web-Based Data Visualization with R, Plotly, and Shiny. Chapman and Hall/CRC. https://plotly-r.com.
Sievert, Carson, Chris Parmer, Toby Hocking, Scott Chamberlain, Karthik Ram, Marianne Corvellec, and Pedro Despouy. 2024. Plotly: Create Interactive Web Graphics via Plotly.js. https://plotly-r.com.
Simon, Noah, Jerome Friedman, Robert Tibshirani, and Trevor Hastie. 2011. “Regularization Paths for Cox’s Proportional Hazards Model via Coordinate Descent.” Journal of Statistical Software 39 (5): 1–13. https://doi.org/10.18637/jss.v039.i05.
Spinu, Vitalie, Garrett Grolemund, and Hadley Wickham. 2023. Lubridate: Make Dealing with Dates a Little Easier. https://lubridate.tidyverse.org.
Strobl, Carolin, Anne-Laure Boulesteix, Thomas Kneib, Thomas Augustin, and Achim Zeileis. 2008. “Conditional Variable Importance for Random Forests.” BMC Bioinformatics 9 (307). https://doi.org/10.1186/1471-2105-9-307.
Strobl, Carolin, Anne-Laure Boulesteix, Achim Zeileis, and Torsten Hothorn. 2007. “Bias in Random Forest Variable Importance Measures: Illustrations, Sources and a Solution.” BMC Bioinformatics 8 (25). https://doi.org/10.1186/1471-2105-8-25.
Tan, Pang-Ning, Michael S. Steinbach, Anuj Karpatne, and Vipin Kumar. 2017. Introduction to Data Mining. 2nd Edition. Pearson. https://www-users.cs.umn.edu/~kumar001/dmbook.
Tan, Pang-Ning, Michael S. Steinbach, and Vipin Kumar. 2005. Introduction to Data Mining. 1st Edition. Addison-Wesley. https://www-users.cs.umn.edu/~kumar001/dmbook/firsted.php.
Tay, J. Kenneth, Balasubramanian Narasimhan, and Trevor Hastie. 2023. “Elastic Net Regularization Paths for All Generalized Linear Models.” Journal of Statistical Software 106 (1): 1–31. https://doi.org/10.18637/jss.v106.i01.
Therneau, Terry, and Beth Atkinson. 2023. Rpart: Recursive Partitioning and Regression Trees. https://github.com/bethatkinson/rpart.
Tillé, Yves, and Alina Matei. 2023. Sampling: Survey Sampling. https://CRAN.R-project.org/package=sampling.
Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with S. 4th ed. New York: Springer. https://www.stats.ox.ac.uk/pub/MASS4/.
Venables, W. N., D. M. Smith, and the R Core Team. 2021. An Introduction to R.
Weihs, Claus, Uwe Ligges, Karsten Luebke, and Nils Raabe. 2005. “klaR Analyzing German Business Cycles.” In Data Analysis and Decision Support, edited by D. Baier, R. Decker, and L. Schmidt-Thieme, 335–43. Berlin: Springer-Verlag.
Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
———. 2023a. Forcats: Tools for Working with Categorical Variables (Factors). https://forcats.tidyverse.org/.
———. 2023b. Stringr: Simple, Consistent Wrappers for Common String Operations. https://stringr.tidyverse.org.
———. 2023c. Tidyverse: Easily Install and Load the Tidyverse. https://tidyverse.tidyverse.org.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd ed. O’Reilly Media, Inc. https://r4ds.hadley.nz/.
Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, Dewey Dunnington, and Teun van den Brand. 2024. Ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. https://ggplot2.tidyverse.org.
Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. Dplyr: A Grammar of Data Manipulation. https://dplyr.tidyverse.org.
Wickham, Hadley, and Lionel Henry. 2023. Purrr: Functional Programming Tools. https://purrr.tidyverse.org/.
Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2024. Readr: Read Rectangular Text Data. https://readr.tidyverse.org.
Wickham, Hadley, Thomas Lin Pedersen, and Dana Seidel. 2023. Scales: Scale Functions for Visualization. https://scales.r-lib.org.
Wickham, Hadley, Davis Vaughan, and Maximilian Girlich. 2024. Tidyr: Tidy Messy Data. https://tidyr.tidyverse.org.
Wilkinson, Leland. 2005. The Grammar of Graphics (Statistics and Computing). Berlin, Heidelberg: Springer-Verlag. https://doi.org/10.1007/0-387-28695-0.
Witten, Ian H., and Eibe Frank. 2005. Data Mining: Practical Machine Learning Tools and Techniques. 2nd ed. San Francisco: Morgan Kaufmann.
Yu, Guangchuang. 2024. Scatterpie: Scatter Pie Plot. https://CRAN.R-project.org/package=scatterpie.
Zaki, Mohammed J. 2000. “Sequence Mining in Categorical Domains: Incorporating Constraints.” In Proceedings of the Ninth International Conference on Information and Knowledge Management, 422–29. CIKM ’00. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/354756.354849.
Zeileis, Achim, Torsten Hothorn, and Kurt Hornik. 2008. “Model-Based Recursive Partitioning.” Journal of Computational and Graphical Statistics 17 (2): 492–514. https://doi.org/10.1198/106186008X319331.