Software Research Scientist
developer, open data advocate, cyclist
mostly C++, R packages
I am a Software Research Scientist for rOpenSci, an organisation dedicated to "Transforming science through open data and software," for which I am helping to develop a new system for peer reviewing statistical software. I also develop R packages for accessing and analysing open spatial data, with a particular focus on urban planning and transport. I am a founding member of Active Transport Futures, and the lead developer of moveability.city, a platform for open-source, open-access interactive visualisations of urban moveability.
25 Sep 20
A primer on ways to extract the actual content of R help files. One day people will hopefully start text-mining these things, and show us all sorts of things we never knew about the people who make R packages. When they do, this entry should help.
07 Nov 19
This article was recently published in the Rcpp Gallery, and demonstrates using the RcppParallel package to aggregate to an output vector. It extends directly from previous demonstrations of single-valued aggregation by providing the details needed to aggregate to a vector or, by extension, to any arbitrary form.
25 Oct 19
Activating GitHub two-factor authentication (2FA) offers an indubitable security boost, with one notable side effect: https authentication requires entering a Personal Access Token instead of a password. This entry explains how I reconfigured my git push commands with 2FA to be able to enter my password once again, instead of a random 32-character token.
04 Jul 19
I recently encountered a problem while bundling an old C library into a new R package. The library itself depends on, and includes, an external "dictionary" in plain text format used to construct a large lookup table. The creators of this library of course assume that this dictionary file will always reside in the same directory as the compiled object, and so can always be directly linked. The `src` directory of R packages is, however, only permitted to contain source code, which text files definitely are not. This blog entry is about where to put such files, and how to link them within the source code.
06 Jun 19
Caching is implemented because it saves time, generally by saving the results of one function call for subsequent reuse. Background processes are also commonly implemented as time-saving measures, through delegating long-running tasks to "somewhere else", allowing you to keep focussing on whatever (un)important things you were doing in the meantime. This blog entry describes how to combine the two to save double time through caching via background processes.
Copyright © 2019--20 mark padgham