get_data function

Retrieve Data from the Database

This is the package's main function for retrieving data from the database. It constructs an SQL query, sends it to the database, and returns the result as a data.table in R.

get_data(
  dsid = NULL,
  series = NULL,
  from = NULL,
  to = NULL,
  labels = TRUE,
  wide = TRUE,
  expand.date = FALSE,
  ordered = TRUE,
  return.query = FALSE,
  ...
)

Arguments

  • dsid: character. (Optional) IDs of datasets matching the 'DSID' column of the 'DATASET' table (retrieved using datasets()). If none of the following arguments are used, all series from those datasets will be returned.
  • series: character. (Optional) codes of series matching the 'Series' column of the 'SERIES' table (retrieved using series()).
  • from: the start of the period to retrieve, given as a date, a date string of the form "YYYY-MM-DD" or "YYYY-MM", a year-quarter of the form "YYYYQN" or "YYYY-QN", a year YYYY (numeric or character), or a fiscal year of the form "YYYY/YY". These expressions are converted to a regular date by make_date.
  • to: the end of the period to retrieve, given in the same forms as from. For expressions that are not full "YYYY-MM-DD" dates, the last day of the period is chosen.
  • labels: logical. TRUE will also return labels (series descriptions) along with the series codes.
  • wide: logical. TRUE calls long2wide on the result. FALSE returns the data in a long format without missing values (suitable for ggplot2).
  • expand.date: logical. TRUE will call expand_date on the result.
  • ordered: logical. TRUE orders the result by 'Date' and, if labels = TRUE, by series, maintaining the column order of series in the dataset(s). FALSE returns the result in no guaranteed order, which speeds up query execution.
  • return.query: logical. TRUE will not query the database but instead return the constructed SQL query as a character string.
  • ...: further arguments passed to long2wide (if wide = TRUE) or expand_date (if expand.date = TRUE). The argument names of these two functions do not conflict.
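
The forms accepted by from and to can be illustrated with a small base-R sketch. The helper parse_period below is hypothetical and is not the package's make_date; it only shows how each form might map to the first day of a period (as for from) or the last day (as for to). It assumes that a fiscal year "YYYY/YY" starts on 1 July of the first year, which the text above does not state.

```r
# Hypothetical helper (not part of the package): convert a period expression
# to the first day of the period, or to its last day if end = TRUE.
parse_period <- function(x, end = FALSE) {
  x <- as.character(x)  # accepts Date objects, numbers and strings
  if (grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)) {        # "YYYY-MM-DD"
    start <- as.Date(x)
    span  <- "day"
  } else if (grepl("^[0-9]{4}-[0-9]{2}$", x)) {          # "YYYY-MM"
    start <- as.Date(paste0(x, "-01"))
    span  <- "month"
  } else if (grepl("^[0-9]{4}-?Q[1-4]$", x)) {           # "YYYYQN" or "YYYY-QN"
    quarter <- as.integer(substr(x, nchar(x), nchar(x)))
    start <- as.Date(sprintf("%s-%02d-01", substr(x, 1, 4), 3L * quarter - 2L))
    span  <- "quarter"
  } else if (grepl("^[0-9]{4}/[0-9]{2}$", x)) {          # "YYYY/YY" fiscal year
    # Assumption: the fiscal year starts on 1 July of the first year
    start <- as.Date(paste0(substr(x, 1, 4), "-07-01"))
    span  <- "year"
  } else if (grepl("^[0-9]{4}$", x)) {                   # YYYY
    start <- as.Date(paste0(x, "-01-01"))
    span  <- "year"
  } else {
    stop("unrecognised period expression: ", x)
  }
  if (!end) return(start)
  # Last day of the period = day before the start of the next period
  seq(start, by = span, length.out = 2)[2] - 1
}

parse_period(2000)                   # 2000-01-01
parse_period("2018-Q2")              # 2018-04-01
parse_period("2018Q2", end = TRUE)   # 2018-06-30
parse_period("2020-02", end = TRUE)  # 2020-02-29
```

Under these assumptions, a call such as get_data("BOU_MMI", from = "2018Q2", to = "2019Q1") would cover 1 April 2018 through 31 March 2019.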

Returns

A data.table with the result of the query.

Details

If labels = FALSE, the 'SERIES' table is not joined to the 'DATA' table, and ordered = TRUE will order the datasets and series retrieved in alphabetical order. If labels = TRUE, data is ordered by series and date within each dataset, preserving the order of columns in the dataset. If multiple datasets are retrieved, they are ordered alphabetically according to the 'DSID' column.
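
The difference between the two orderings can be sketched with toy tables in base R. The table contents, the "CPI" code and the labels below are made up for illustration, and the code does not reproduce the SQL that get_data constructs; it only shows why the dataset's column order is available when the series table is joined and not otherwise.

```r
# Toy stand-in for the 'SERIES' table: rows follow the dataset's column order
series_tbl <- data.frame(
  Series = c("M2", "CPI", "BTI"),
  Label  = c("Toy label A", "Toy label B", "Toy label C"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the 'DATA' table, rows in no particular order
data_tbl <- data.frame(
  Date   = as.Date(c("2020-02-01", "2020-01-01", "2020-01-01",
                     "2020-02-01", "2020-01-01", "2020-02-01")),
  Series = c("CPI", "BTI", "M2", "M2", "CPI", "BTI"),
  Value  = c(2, 5, 1, 3, 4, 6),
  stringsAsFactors = FALSE
)

# Without the series table, series can only be sorted by their codes
alphabetical <- data_tbl[order(data_tbl$Series, data_tbl$Date), ]

# With the series table, rows can follow the dataset's own column order
position      <- match(data_tbl$Series, series_tbl$Series)
dataset_order <- data_tbl[order(position, data_tbl$Date), ]

unique(alphabetical$Series)   # "BTI" "CPI" "M2"
unique(dataset_order$Series)  # "M2" "CPI" "BTI"
```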

It is possible to query multiple series from multiple datasets, e.g. get_data(c("DSID1", "DSID2"), c("SERFROM1", "SERFROM2")), but care needs to be taken that the series queried do not occur in both datasets (see .IDvars, and check using series(c("DSID1", "DSID2"))). Series from datasets at different frequencies can be queried, but, if wide = TRUE, this will result in missing values for all but the first observation per period in the lower-frequency series.
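
The mixed-frequency caveat can be reproduced with toy data in base R. The series names "M" and "Q" are invented, and reshape() stands in for the package's long2wide here, so this is only a sketch of the effect, assuming quarterly observations are dated on the first day of the quarter.

```r
# Toy long-format data: a monthly series "M" and a quarterly series "Q"
long <- data.frame(
  Date   = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01",
                     "2020-01-01", "2020-04-01")),
  Series = c("M", "M", "M", "M", "Q", "Q"),
  Value  = c(1, 2, 3, 4, 10, 20),
  stringsAsFactors = FALSE
)

# Base-R stand-in for a long-to-wide reshape (not the package's long2wide)
wide <- reshape(long, idvar = "Date", timevar = "Series", direction = "wide")
wide <- wide[order(wide$Date), ]

# Value.Q is NA for February and March: the quarterly series only has a
# value on the first date of each quarter
print(wide)
```

Keeping wide = FALSE avoids these gaps, since the long format holds one row per observation.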

Examples

# Return monthly macroeconomic indicators from the year 2000 onwards
get_data("BOU_MMI", from = 2000, wide = FALSE)

# Return wide format with date expanded
get_data("BOU_MMI", from = 2000, expand.date = TRUE)

# Same thing in multiple steps (with additional customization options):
library(magrittr) # Pipe %>% operators
get_data("BOU_MMI", from = 2000, wide = FALSE) %>% long2wide %>% expand_date

# Getting a single series
get_data("BOU_MMI", "M2", 2000)

# Getting High-Frequency activity indicators from BoU and Revenue & Expense from MoFPED
get_data(c("BOU_MMI", "MOF_TOT", "WB_WDI"), c("CIEA", "BTI", "REV_GRA", "EXP_LEN"))

# Getting daily interest rates and plotting
library(xts) # Time series class
get_data("BOU_I", from = 2018, wide = FALSE) %>%
  long2wide(names_from = "Label") %>%
  as.xts %>%
  plot(legend.loc = "topleft")

See Also

long2wide, expand_date, ugatsdb