trim: logical. If TRUE, leading and trailing white space is removed.
clean: logical. If TRUE, extra white spaces and escaped characters will be removed.
pattern: A character string containing a regular expression (or a literal character string when fixed = TRUE) to be matched in the given character vector (see Details for additional information). The default, "@rm_citation", uses the rm_citation regex from the regular expression dictionary supplied via the dictionary argument.
replacement: Replacement for matched pattern.
extract: logical. If TRUE, the citations are extracted into a list of vectors.
dictionary: A dictionary of canned regular expressions to search within if pattern begins with "@rm_".
...: Ignored.
x: The output from ex_citation.
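The interaction of pattern, dictionary, replacement, clean, and trim can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: toy_dictionary, resolve_pattern, and toy_rm are hypothetical names, the single toy regex is not the one shipped in the package, and clean is approximated here by collapsing runs of white space only.

```r
# Hypothetical stand-in for a regex dictionary; NOT the package's own data.
toy_dictionary <- list(
    rm_year_paren = "\\([0-9]{4}\\)"  # toy pattern: a 4-digit year in parentheses
)

# Hypothetical helper: look up patterns that begin with "@rm_" in a dictionary.
resolve_pattern <- function(pattern, dictionary) {
    if (startsWith(pattern, "@rm_")) {
        key <- substring(pattern, 2)  # drop the leading "@"
        if (is.null(dictionary[[key]])) stop("no dictionary entry named ", key)
        return(dictionary[[key]])
    }
    pattern  # any other pattern is used as given
}

# Hypothetical helper mirroring the documented arguments.
toy_rm <- function(x, pattern, replacement = "", trim = TRUE, clean = TRUE,
                   dictionary = toy_dictionary) {
    out <- gsub(resolve_pattern(pattern, dictionary), replacement, x, perl = TRUE)
    if (clean) out <- gsub("\\s+", " ", out, perl = TRUE)  # collapse runs of white space
    if (trim) out <- trimws(out)                           # drop leading/trailing white space
    out
}

toy_rm("The R Core Team (2014) has many members.", "@rm_year_paren")
```

With clean = FALSE, the white space on both sides of the removed match is left in place, which is why the removal functions typically clean by default.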
Returns
rm_citation returns a character string with citations removed.
as_count returns a data.frame of Authors, Years, and n (counts).
Details
The default regular expression used by rm_citation finds both in-text and parenthetical citations. This behavior can be altered by using a secondary regular expression from the regex_usa data (or another dictionary): pattern = "@rm_citation2" matches in-text citations only, and pattern = "@rm_citation3" matches parenthetical citations only. See Examples for usage.
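The distinction between in-text and parenthetical citations can be illustrated with two deliberately simplified regular expressions in base R. These toy patterns are assumptions made for illustration; they are not the expressions stored in the dictionary and will miss many real citation forms (initials, "in press", "n.d.", multiple citations in one set of parentheses). extract_all is a hypothetical helper.

```r
# Toy patterns for illustration only; NOT the package's dictionary regexes.
toy_in_text <- "[A-Z][A-Za-z]+ \\([0-9]{4}\\)"  # e.g. "Bunn (2005)"
toy_parenthetical <- "\\([^()]*, [0-9]{4}\\)"   # e.g. "(Rinker, 2014)"

x <- c("Narcissism is not dead (Rinker, 2014)", "Bunn (2005) said it first.")

# Hypothetical helper: extract every match per element as a list of vectors.
extract_all <- function(x, pattern) {
    regmatches(x, gregexpr(pattern, x, perl = TRUE))
}

in_text <- extract_all(x, toy_in_text)              # matches only in x[2]
parenthetical <- extract_all(x, toy_parenthetical)  # matches only in x[1]

# Joining the two patterns with "|" finds both kinds in a single pass.
both <- extract_all(x, paste(toy_in_text, toy_parenthetical, sep = "|"))
```

Each result is a list with one character vector per input element, which is the same shape as the "list of vectors" described for extract = TRUE.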
Note
This function is experimental.
Examples
## All Citations
x <- c("Hello World (V. Raptor, 1986) bye",
    "Narcissism is not dead (Rinker, 2014)",
    "The R Core Team (2014) has many members.",
    paste("Bunn (2005) said, \"As for elegance, R is refined, tasteful, and",
        "beautiful. When I grow up, I want to marry R.\""),
    "It is wrong to blame ANY tool for our own shortcomings (Baer, 2005).",
    "Wickham's (in press) Tidy Data should be out soon.",
    "Rinker's (n.d.) dissertation not so much.",
    "I always consult xkcd comics for guidance (Foo, 2012; Bar, 2014).",
    "Uwe Ligges (2007) says, \"RAM is cheap and thinking hurts\""
)

rm_citation(x)
ex_citation(x)
as_count(ex_citation(x))
rm_citation(x, replacement="[CITATION HERE]")

## Not run:
qdapTools::vect2df(sort(table(unlist(rm_citation(x, extract=TRUE)))),
    "citation", "count")
## End(Not run)

## In-Text
ex_citation(x, pattern="@rm_citation2")

## Parenthetical
ex_citation(x, pattern="@rm_citation3")

## Not run:
## Mining Citation
if (!require("pacman")) install.packages("pacman")
pacman::p_load(qdap, qdapTools, dplyr, ggplot2)

url_dl("http://umlreading.weebly.com/uploads/2/5/2/5/25253346/whole_language_timeline-updated.docx")

parts <- read_docx("whole_language_timeline-updated.docx") %>%
    rm_non_ascii() %>%
    split_vector(split = "References", include = TRUE, regex=TRUE)

parts[[1]]

parts[[1]] %>%
    unbag() %>%
    ex_citation() %>%
    c()

## Counts
parts[[1]] %>%
    unbag() %>%
    ex_citation() %>%
    as_count()

## By line
ex_citation(parts[[1]])

## Frequency
cites <- parts[[1]] %>%
    unbag() %>%
    ex_citation() %>%
    c() %>%
    data_frame(citation=.) %>%
    count(citation) %>%
    arrange(n) %>%
    mutate(citation=factor(citation, levels=citation))

## Distribution of citations (find locations and then plot)
cite_locs <- do.call(rbind, lapply(cites[[1]], function(x){
    m <- gregexpr(x, unbag(parts[[1]]), fixed=TRUE)
    data.frame(
        citation=x,
        start = m[[1]] - 5,
        end = m[[1]] + 5 + attributes(m[[1]])[["match.length"]]
    )
}))

ggplot(cite_locs) +
    geom_segment(aes(x=start, xend=end, y=citation, yend=citation), size=3,
        color="yellow") +
    xlab("Duration") +
    scale_x_continuous(expand = c(0,0),
        limits = c(0, nchar(unbag(parts[[1]])) + 25)) +
    theme_grey() +
    theme(
        panel.grid.major=element_line(color="grey20"),
        panel.grid.minor=element_line(color="grey20"),
        plot.background = element_rect(fill="black"),
        panel.background = element_rect(fill="black"),
        panel.border = element_rect(colour = "grey50", fill=NA, size=1),
        axis.text=element_text(color="grey50"),
        axis.title=element_text(color="grey50")
    )
## End(Not run)