docx_summary function

Get Word content in a data.frame

Reads the content of a Word document and returns a data.frame representing the document.

docx_summary(x, preserve = FALSE, remove_fields = FALSE, detailed = FALSE)

Arguments

  • x: an rdocx object

  • preserve: If FALSE (default), text in table cells is collapsed into a single line. If TRUE, line breaks in table cells are preserved as a "\n" character. This feature is adapted from docxtractr::docx_extract_tbl(), published under an MIT license in the {docxtractr} package by Bob Rudis.

  • remove_fields: If TRUE, field codes are kept out of the returned data.frame. Defaults to FALSE.

  • detailed: Whether to include information on runs in the summary data.frame. Defaults to FALSE. If TRUE, a list column named run is added to the summary; each element is a data.frame with one row per run and one column per formatting property.
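The arguments above can be combined freely. The following is a minimal sketch, assuming the officer package is installed and using the example file it ships with; the object names (basic, clean, with_runs) are illustrative only.

```r
library(officer)

# Example file shipped with the officer package
example_docx <- system.file(package = "officer", "doc_examples/example.docx")
doc <- read_docx(example_docx)

# Default call: table cell text collapsed, field codes kept, no run details
basic <- docx_summary(doc)

# Keep line breaks inside table cells and leave field codes out
clean <- docx_summary(doc, preserve = TRUE, remove_fields = TRUE)

# Request the `run` list column described for the `detailed` argument
with_runs <- docx_summary(doc, detailed = TRUE)

# Check that the extra column is present
"run" %in% names(with_runs)
```

Each call returns an ordinary data.frame, so the results can be compared or inspected with the usual tools such as str() or head().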

Note

Documents included with body_add_docx() will not be accessible in the results.
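The limitation in this note can be illustrated with a small sketch. It assumes the officer package is installed; the file paths, object names and paragraph texts are hypothetical and exist only for the demonstration.

```r
library(officer)

# Write a small document that will be embedded (hypothetical content)
inner_path <- tempfile(fileext = ".docx")
inner <- read_docx()
inner <- body_add_par(inner, "Text from the embedded document")
print(inner, target = inner_path)

# Build an outer document and embed the inner one with body_add_docx()
outer_path <- tempfile(fileext = ".docx")
outer <- read_docx()
outer <- body_add_par(outer, "Text from the outer document")
outer <- body_add_docx(outer, src = inner_path)
print(outer, target = outer_path)

# Summarise the saved file; per the note, only content written
# directly into the outer document is expected to be listed
res <- docx_summary(read_docx(outer_path))
res$text
```

If the content of the embedded file is needed, one option is to read that file separately with read_docx() and call docx_summary() on it.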

Examples

library(officer)

example_docx <- system.file(
  package = "officer",
  "doc_examples/example.docx"
)
doc <- read_docx(example_docx)

docx_summary(doc)

docx_summary(doc, preserve = TRUE)[28, ]
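Because the result is a plain data.frame, it can be filtered with base R. The sketch below assumes the summary has a content_type column whose values include "table cell"; that column name and value are not stated on this page, so the code checks for the column before using it.

```r
library(officer)

example_docx <- system.file(package = "officer", "doc_examples/example.docx")
res <- docx_summary(read_docx(example_docx), preserve = TRUE)

# Inspect which columns are available in this version of officer
names(res)

# Assumption: a `content_type` column distinguishes paragraphs from table cells
if ("content_type" %in% names(res)) {
  table_cells <- res[res$content_type == "table cell", ]
  head(table_cells)
}
```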