This package brings together some functions I have created to perform repetitive tasks.
NB: this README file is a work in progress. For more information on the functions, please see the specific help files once the package is installed.
Installation
Install the latest release with the following code:
install.packages('devtools')
devtools::install_github("EstherHerbert/Useful.functions@*release")
library(Useful.functions)
If you’d like to install the latest development version then you can omit @*release from the call to install_github().
Functions
Working with data from CTRU database
wrangle_lookups()
When the lookups file has been provided from an export of the Edith database specification (as opposed to the export from prospect) this function wrangles the lookups and fields data frames into a format usable by read_prospect() and apply_labels().
read_prospect()
This function reads in data from the prospect database. It takes the file path to the exported csv file and the data dictionary (typically called ‘lookups.csv’), applies the factor labels and does some basic formatting.
NB: the lookups file can be provided in a couple of different formats, but currently read_prospect() only works with one of these. See wrangle_lookups() if you need to convert to this format.
apply_labels()
This function applies the field labels to the column of a data frame. This is particularly useful after creating a summary table (e.g. with discrete_table(), see below).
Summary tables
continuous_table()
Creates a summary table of specified continuous variables in a data frame. A grouping variable can be provided with an optional total column too (see totals()), as well as other options.
discrete_table()
Creates a summary table of specified discrete (factor) variables in a data frame. A grouping variable can be provided with an optional total column too (see totals()), as well as other options.
missing_table()
Creates a table summarising whether specified variables are present/missing in a data frame. A grouping variable can be provided with an optional total column too (see totals()), as well as other options.
ticked_table()
Creates a summary table of specified ‘ticked’ (binary) variables in a data frame. A grouping variable can be provided with an optional total column too (see totals()), as well as other options.
NB: at the moment this function summarises variables which are "Ticked", "", or NA.
ae_table()
Creates a summary table of the number of events and the number and percentage of individuals for adverse event data (although it could be used with other event data). A grouping variable can be provided with an optional total column too (see totals()), as well as other options.
NB: at the moment you need to create a variable beforehand (e.g., called “All AEs”) to summarise a total count of AEs.
Functions for use with ‘xtable’
Typically I write my reports in R Markdown (exporting to pdf) and I like to use the ‘xtable’ package to convert my tables to LaTeX. However, I have found that occasionally ‘xtable’ doesn’t have as much versatility as I’d like. These functions assist with that.
add_clines()
Produces a list with ‘pos’ and ‘command’ to use with print.xtable()’s add.to.row option. This will create a partial horizontal line across specified columns of the table. Users can either specify the row numbers they’d like to add the \cline to or the function will look for the rows which use multirow in a specified column.
add_multirow()
Turns duplicate rows (within a variable) which occur consecutively into NAs and adds multirow to the remaining rows.
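The idea can be sketched in base R as follows. This is not the package's implementation: the helper name blank_repeats() and the exact \multirow{n}{*}{...} markup are assumptions made for illustration, and add_multirow() may differ in both.

```r
# Hypothetical stand-in for add_multirow(), applied to a single column.
blank_repeats <- function(x) {
  x <- as.character(x)
  runs <- rle(x)
  # Index of the first element of each run of consecutive duplicates.
  starts <- cumsum(c(1, head(runs$lengths, -1)))
  out <- rep(NA_character_, length(x))
  # Only the first row of each run keeps its value, wrapped in \multirow.
  out[starts] <- sprintf("\\multirow{%d}{*}{%s}", runs$lengths, runs$values)
  out
}

labels <- blank_repeats(c("Sex", "Sex", "Age", "Age", "Age"))
```

The sketch assumes a non-empty column without missing values.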
longtable_head()
For use with print.xtable()’s add.to.row when tabular.environment = "longtable". The function formats header information from an xtable for adding to the printed output at position 0 so that it is repeated when the table crosses to the next page.
makerow()
Collapses a vector into a character string separated by “&”, helpful for the add.to.row option in print.xtable().
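A minimal base R sketch of the same idea, assuming nothing about makerow()'s actual arguments. The helper name collapse_row() is hypothetical, and the row terminator is added by hand here because the description above does not say whether makerow() adds one.

```r
# Hypothetical stand-in for makerow(): join cells with LaTeX column separators.
collapse_row <- function(x) {
  paste(x, collapse = " & ")
}

header <- collapse_row(c("Variable", "Treatment", "Control"))

# A LaTeX table row also needs a terminator (two backslashes and a newline).
latex_row <- paste0(header, " \\\\\n")
```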
merge_colnames()
For use with print.xtable()’s add.to.row when column names are repeated and should be merged. The function removes consecutive, duplicate column names and uses LaTeX’s multicolumn to merge the cells.
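As a rough illustration of the merging step, the base R sketch below collapses runs of identical column names into \multicolumn cells. The helper name merge_header() and the centred "c" alignment are assumptions; merge_colnames() itself may format its output differently.

```r
# Hypothetical stand-in for merge_colnames(), working on a character vector.
merge_header <- function(cols) {
  runs <- rle(cols)
  # Names repeated consecutively become a single \multicolumn cell.
  cells <- ifelse(
    runs$lengths > 1,
    sprintf("\\multicolumn{%d}{c}{%s}", runs$lengths, runs$values),
    runs$values
  )
  paste0(paste(cells, collapse = " & "), " \\\\")
}

header <- merge_header(c("", "Treatment", "Treatment", "Control", "Control"))
```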
Other functions
package_info()
Gets the current loaded packages and their versions. I find this useful when writing my script header.
round0()
Rounds numbers whilst keeping trailing zeros. It’s a shortcut for formatC(x, digits, format = "f").
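The difference from round() can be seen with the underlying formatC() call directly. The example values are made up, and digits is passed by name.

```r
x <- c(1.5, 2.301, 3)

# round() returns numbers, so trailing zeros are lost when they are printed.
rounded <- round(x, 2)

# formatC() with format = "f" returns strings with a fixed number of decimals.
formatted <- formatC(x, digits = 2, format = "f")
# formatted is c("1.50", "2.30", "3.00")
```

Note that the result is a character vector, so it is suited to table output rather than further arithmetic.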
row_to_colnames()
Takes a row of data and uses it to replace the column names. Default is to use the first row as the column names but you can specify another row.
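The behaviour described can be sketched in base R as below. The helper name promote_row() and its row argument are hypothetical and are not taken from the package.

```r
# Hypothetical stand-in for row_to_colnames().
promote_row <- function(data, row = 1) {
  # Use the chosen row's values as the new column names.
  names(data) <- as.character(unlist(data[row, ]))
  # Drop that row from the data and renumber the remaining rows.
  data <- data[-row, , drop = FALSE]
  rownames(data) <- NULL
  data
}

raw <- data.frame(V1 = c("id", "1", "2"), V2 = c("score", "10", "20"),
                  stringsAsFactors = FALSE)
tidy <- promote_row(raw)
```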
colnames_to_row()
Moving the column names of a data frame to a row is sometimes useful when formatting a table for export. This function does that and the user can specify which row they’d like the column names to be inserted before.
search_list()
Takes a list of data frames and searches their variables for either a specific variable name or for a string within the variable names.
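A small base R sketch of this kind of search, under the assumption that a fixed (non-regex) match is wanted. The helper find_variable(), its exact argument and the example data frames are all made up for illustration.

```r
# Hypothetical stand-in for search_list().
find_variable <- function(dfs, pattern, exact = FALSE) {
  hits <- lapply(dfs, function(d) {
    if (exact) {
      names(d)[names(d) == pattern]
    } else {
      grep(pattern, names(d), value = TRUE, fixed = TRUE)
    }
  })
  # Keep only the data frames that contain at least one matching variable.
  hits[lengths(hits) > 0]
}

forms <- list(
  baseline = data.frame(id = 1, dob = "2000-01-01", visit_date = "2020-01-01"),
  followup = data.frame(id = 1, visit_date = "2020-06-01"),
  lab = data.frame(id = 1, result = 5)
)
found <- find_variable(forms, "date")
```

Here found is a named list, so the names show which data frames hold a matching variable.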
split_colnames()
Splits column names into two rows. The first row becomes the new column names and the second becomes the first row of the data.
stata_expand()
Recreates the expand function in Stata. Takes a data frame and duplicates it n times, then creates a variable (default name is Duplicate) which has a value of 0 if the observation originally appeared in the dataset and i = 1, ..., n for each duplicate.
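The duplication described above can be sketched in base R like this. The helper expand_sketch() and its argument names are assumptions, and the row order of the real function's output may differ.

```r
# Hypothetical stand-in for stata_expand().
expand_sketch <- function(data, n, name = "Duplicate") {
  # Copy 0 is the original data; copies 1..n are the duplicates.
  copies <- lapply(0:n, function(i) {
    data[[name]] <- rep(i, nrow(data))
    data
  })
  out <- do.call(rbind, copies)
  rownames(out) <- NULL
  out
}

trial <- data.frame(id = 1:3, arm = c("A", "B", "A"))
expanded <- expand_sketch(trial, n = 1)
```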
totals()
This function uses stata_expand() and dplyr::mutate() to give a data frame that, when summarised, will give total rows and/or columns.
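The reason a duplicated data frame yields totals can be shown with a small base R sketch. It does not use the package or dplyr, and the data and the "Total" label are made up for illustration.

```r
trial <- data.frame(id = 1:4, arm = c("A", "B", "A", "B"),
                    stringsAsFactors = FALSE)

# Stack a second copy of the data whose group label is overwritten.
total_copy <- trial
total_copy$arm <- "Total"
with_total <- rbind(trial, total_copy)

# Summarising by group now produces a "Total" group alongside A and B.
counts <- table(with_total$arm)
```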
unicol
Not technically a function, this is a data set in the package which has HEX colour specifications for the university colour palette. There are two functions, display_unicol() and get_unicol(), which display the colours and get colour information.
Addins
Addins are functions which are used interactively within RStudio. There are three Addins provided in this package and keyboard shortcuts can be created for them to speed up coding. More information on Addins, including how to set keyboard shortcuts, can be found here.
Insert Section
This Addin inserts dashes from the cursor location to the ‘end’ of the row (width obtained with getOption()).
R Markdown template
To save time when writing statistics reports I have created an R Markdown template. Currently the report only exports to pdf (my personal preference) but it could be modified to export to Word.
You can access the template either by creating a new R Markdown document from the menu in RStudio and then looking through the templates or using rmarkdown::draft("my-report.Rmd", template = "statistics-report", package = "Useful.functions"). Both of these options will create a folder with your report name containing the markdown file and the templates required to generate the report. I recommend you render it to pdf using rmarkdown::render rather than using the knit button.
NB: I plan to create other templates in the future, e.g., for status reports or presentations.