page_content function

Retrieves MediaWiki page content

page_content retrieves the DOM of a particular MediaWiki page, as an HTML blob inside a JSON object.

page_content(
  language = NULL,
  project = NULL,
  domain = NULL,
  page_name,
  page_id = NULL,
  as_wikitext = FALSE,
  clean_response = FALSE,
  ...
)

Arguments

  • language: The language code of the project you wish to query, if appropriate.
  • project: The project you wish to query ("wikiquote"), if appropriate. Should be provided in conjunction with language.
  • domain: As an alternative to a language and project combination, you can provide a domain ("rationalwiki.org") to the URL constructor, which allows querying non-Wikimedia MediaWiki instances.
  • page_name: The title of the page you want to retrieve.
  • page_id: The page ID of the page you want to retrieve. Set to NULL by default. It is an alternative to page_name; if both are provided, page_id will be used.
  • as_wikitext: Whether to retrieve the wikimarkup (TRUE) or the HTML (FALSE). Set to FALSE by default.
  • clean_response: Whether to do some basic sanitising of the resulting data structure. Set to FALSE by default.
  • ...: Further arguments to pass to httr's GET.
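
The interplay between these arguments can be illustrated with a small sketch. The helpers build_api_url and page_selector below are hypothetical and are not part of WikipediR; the "/w/api.php" path, the "page"/"pageid" parameter names, and the choice to let domain win over language and project are assumptions made for illustration. Only the rule that page_id takes precedence over page_name comes from the argument descriptions above.

```r
# Hypothetical helper (not a WikipediR function): shows how a language/project
# pair or a domain could each resolve to a MediaWiki API endpoint.
# Assumption: the API lives at /w/api.php, and domain wins if supplied.
build_api_url <- function(language = NULL, project = NULL, domain = NULL) {
  if (!is.null(domain)) {
    return(paste0("https://", domain, "/w/api.php"))
  }
  if (is.null(language) || is.null(project)) {
    stop("Provide either a domain or both language and project.")
  }
  paste0("https://", language, ".", project, ".org/w/api.php")
}

# Hypothetical helper: page_id is used in preference to page_name
# when both are provided, as the argument description states.
page_selector <- function(page_name = NULL, page_id = NULL) {
  if (!is.null(page_id)) {
    return(list(pageid = page_id))
  }
  if (is.null(page_name)) {
    stop("Provide page_name or page_id.")
  }
  list(page = page_name)
}

build_api_url("en", "wikipedia")
build_api_url(domain = "rationalwiki.org")
page_selector(page_name = "New Age", page_id = 12)
```

The real package may construct its requests differently; the sketch is only meant to make the precedence rules concrete.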

Examples

## Not run:
# Content from a Wikimedia project
wp_content <- page_content("en", "wikipedia", page_name = "Aaron Halfaker")

# Content by ID
wp_content <- page_content("en", "wikipedia", page_id = 12)

# Content from a non-Wikimedia project
rw_content <- page_content(domain = "rationalwiki.org", page_name = "New Age")
## End(Not run)
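
As a further sketch, the as_wikitext and clean_response arguments can be combined in a thin wrapper. fetch_wikitext is a hypothetical name, not a WikipediR function, and the example assumes the WikipediR package is installed and network access is available. The shape of the returned object is not assumed here; str() is used to inspect it instead.

```r
# Hypothetical wrapper (not part of WikipediR): requests wikimarkup
# rather than HTML and asks for the basic response sanitising.
fetch_wikitext <- function(title, language = "en", project = "wikipedia") {
  WikipediR::page_content(
    language = language,
    project = project,
    page_name = title,
    as_wikitext = TRUE,     # wikimarkup instead of HTML
    clean_response = TRUE   # basic sanitising of the returned structure
  )
}

## Not run:
# wt <- fetch_wikitext("Aaron Halfaker")
# str(wt, max.level = 2)  # inspect the structure rather than assuming field names
## End(Not run)
```

All arguments are passed by name because an unnamed extra argument would be matched positionally to the first unused formal (domain) rather than falling through to `...`.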

See Also

revision_diff for retrieving 'diffs' between revisions, revision_content for retrieving the text of specified revisions.

  • Maintainer: Os Keyes
  • License: MIT + file LICENSE
  • Last published: 2024-04-05