tidyfeed function

Extract a tidy data frame from RSS, Atom and JSON feeds

tidyfeed() downloads and parses RSS, Atom and JSON feeds. It returns either a tidy data frame or a named list, ready for further manipulation and analysis.

tidyfeed(feed, config = list(), clean_tags = TRUE, list = FALSE, parse_dates = TRUE)

Arguments

  • feed: character, the URL of the feed that you want to parse, e.g. "http://journal.r-project.org/rss.atom".
  • config: Arguments passed on to httr::GET().
  • clean_tags: logical, default TRUE. Strips HTML tags from columns.
  • list: logical, default FALSE. If TRUE, return metadata and content as separate data frames in a named list.
  • parse_dates: logical, default TRUE. If TRUE, tidyfeed() will attempt to parse columns that contain datetime values. This may fail; see Note.
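As a rough illustration of the config argument, the sketch below builds a request timeout with httr::timeout() and hands it to tidyfeed(). It assumes that config is forwarded unchanged to httr::GET(), as the argument description suggests. The 10-second value and the names cfg and feed_df are illustrative only.

```r
# Sketch only: assumes the httr and tidyRSS packages are installed.
if (requireNamespace("httr", quietly = TRUE)) {
  # A 10-second request timeout, built with httr's own helper.
  cfg <- httr::timeout(10)

  # Only touch the network in an interactive session with tidyRSS available.
  if (interactive() && requireNamespace("tidyRSS", quietly = TRUE)) {
    # Assumption: tidyfeed() passes `config` straight through to httr::GET().
    feed_df <- tidyRSS::tidyfeed(
      "http://journal.r-project.org/rss.atom",
      config = cfg
    )
  }
}
```

Other httr configuration helpers, such as httr::user_agent(), should be usable the same way if that assumption holds.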

Note

tidyfeed() attempts to parse columns that should contain dates. This can fail for some feeds. If you need lower-level control over date parsing, set parse_dates to FALSE and parse these columns yourself.
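A minimal sketch of parsing dates yourself with base R follows. It assumes that, with parse_dates = FALSE, the date columns come back as character strings. The vector raw_dates is a hypothetical stand-in for such a column, using Atom-style timestamps; RSS feeds often use a different layout and would need a different format string.

```r
# Hypothetical stand-in for an unparsed date column from a feed.
raw_dates <- c(
  "2023-03-05T12:30:00Z",
  "2023-03-06T08:15:45Z",
  "not a date"
)

# Parse with an explicit format; strings that do not match become NA
# instead of raising an error, so failures are easy to inspect.
parsed <- as.POSIXct(raw_dates, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

# Entries that failed to parse, kept for manual follow-up.
failed <- raw_dates[is.na(parsed)]
```

Supplying the format explicitly keeps control over which layouts are accepted, and the failed vector shows exactly which values need another format.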

Examples

## Not run:
# Atom feed:
tidyfeed("http://journal.r-project.org/rss.atom")
# rss/xml:
tidyfeed("http://fivethirtyeight.com/all/feed")
# jsonfeed:
tidyfeed("https://daringfireball.net/feeds/json")
## End(Not run)

References

https://en.wikipedia.org/wiki/RSS

See Also

GET()

Author(s)

Robert Myles McDonnell, robertmylesmcdonnell@gmail.com

  • Maintainer: Robert Myles McDonnell
  • License: MIT + file LICENSE
  • Last published: 2023-03-05