corpora_files: Vector containing the paths to two corpus files (e.g. Scopus exports). For each record, the CSV files should contain at least Authors (comma separated), Publication Title, Publication Year, and References (semicolon separated). Including DOI (for date checking; see the retrieve_pubdates option) as well as Abstract, Author.Keywords, and Index.Keywords (for the in-depth identification of publications belonging to both corpora) is strongly recommended.
labels: Labels (i.e. names) given to the two corpora to be analyzed.
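keywords: Keywords identifying the two corpora, one search pattern per corpus. As in the example, alternative spellings can be separated with "|".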
retrieve_pubdates: Flag indicating whether to confirm publication dates by retrieving them from the DOI (see get_date_from_doi).
clean_refs: Flag indicating whether to attempt to clean references and keep titles only. NOT RECOMMENDED, especially if build_graph is to be used subsequently.
encoding: Character encoding used in the input files.
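The column requirements for corpora_files can be checked before the files are passed on. The following is only a sketch: check_corpus_columns is a hypothetical helper, not part of Diderot, and the column names (Title, Year, References, and so on) are assumed to match the Scopus export shown in the example.

```r
# Hypothetical helper (not part of Diderot): report which expected columns
# are absent from a corpus data frame. Column names are assumed to follow
# the Scopus export layout.
check_corpus_columns <- function(df) {
  required    <- c("Authors", "Title", "Year", "References")
  recommended <- c("DOI", "Abstract", "Author.Keywords", "Index.Keywords")
  list(
    missing_required    = setdiff(required, names(df)),
    missing_recommended = setdiff(recommended, names(df))
  )
}

# Toy stand-in for read.csv("IBMmerged.csv", stringsAsFactors = FALSE);
# the values are placeholders, only the column names matter here.
toy <- data.frame(
  Authors    = "Author A., Author B.",
  Title      = "Toy title",
  Year       = 2010L,
  References = "Toy reference 1; Toy reference 2",
  stringsAsFactors = FALSE
)

report <- check_corpus_columns(toy)
print(report$missing_required)     # empty: all required columns are present
print(report$missing_recommended)  # the four recommended columns are absent
```

With a real export, replace the toy data frame with the result of read.csv on each file, using the same encoding that will be passed to create_bibliography.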
Returns
Returns a dataframe containing a bibliographic dataset usable by Diderot and including all references from both corpora.
## Not run:
# Two corpora on individual-based modelling (IBM) and agent-based modelling (ABM)
# were downloaded from Scopus. The structure of each corpus is as follows:
tt <- read.csv("IBMmerged.csv", stringsAsFactors=FALSE)
str(tt, strict.width="cut")
### 'data.frame': 3184 obs. of 9 variables:
### $ Authors        : chr "Chen J., Marathe A., Marathe M." "Van Dijk D., Sl"..
### $ Title          : chr "Coevolution of epidemics, social networks, and in"..
### $ Year           : int 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 ...
### $ DOI            : chr "10.1007/978-3-642-12079-4_28" "10.1016/j.procs.20"..
### $ Link           : chr "http://www.scopus.com/inward/record.url?eid=2-s2."..
### $ Abstract       : chr "This research shows how a limited supply of antiv"..
### $ Author.Keywords: chr "Antiviral; Behavioral economics; Epidemic; Microe"..
### $ Index.Keywords : chr "Antiviral; Behavioral economics; Epidemic; Microe"..
### $ References     : chr "(2009) Centre Approves Restricted Retail Sale of "..

# Define the name of corpora (labels) and specific keywords to identify relevant
# publications (keys).
labels <- c("IBM","ABM")
keys <- c("individual-based model|individual based model",
          "agent-based model|agent based model")

# Build the IBM-ABM bibliographical dataset from Scopus exports
db <- create_bibliography(corpora_files=c("IBMmerged.csv","ABMmerged.csv"),
                          labels=labels, keywords=keys)
### [1] "File IBMmerged.csv contains 3184 records"
### [1] "File ABMmerged.csv contains 9641 records"

# Processed output. Note the field name changes (for standardization with ISI Web
# of Knowledge format) and addition of the "Corpus" field (with identification of
# joint "IBM | ABM" publications based on keywords).
str(db, strict.width="cut")
### 'data.frame': 12504 obs. of 10 variables:
### $ Authors         : chr "Chen J., Marathe A., Marathe M." "Van Dijk D., Sloot "..
### $ Cite Me As      : chr "Coevolution of epidemics, social networks, and indivi"..
### $ Year            : int 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 ...
### $ DOI             : chr "10.1007/978-3-642-12079-4_28" "10.1016/j.procs.2010.0"..
### $ Link            : chr "http://www.scopus.com/inward/record.url?eid=2-s2.0-78"..
### $ Abstract        : chr "This research shows how a limited supply of antiviral"..
### $ Author.Keywords : chr "Antiviral; Behavioral economics; Epidemic; Microecono"..
### $ Index.Keywords  : chr "Antiviral; Behavioral economics; Epidemic; Microecono"..
### $ Cited References: chr "(2009) Centre Approves Restricted Retail Sale of Tami"..
### $ Corpus          : chr "IBM" "IBM | ABM" "IBM | ABM" "IBM" ...
## End(Not run)
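To illustrate how a joint "IBM | ABM" value in the Corpus field could arise from the keys, here is a minimal base-R sketch. It assumes case-insensitive regular-expression matching against a single text field. tag_corpus is a hypothetical function, and Diderot's actual matching logic may differ.

```r
labels <- c("IBM", "ABM")
keys   <- c("individual-based model|individual based model",
            "agent-based model|agent based model")

# Toy records; each string stands in for the title, abstract and keyword
# fields of one publication.
toy_text <- c(
  "An individual-based model of fish schooling",
  "Comparing agent-based model and individual based model designs",
  "An agent based model of market dynamics"
)

# Hypothetical tagging rule (an assumption, not Diderot's code):
# a record receives every label whose pattern matches its text.
tag_corpus <- function(text, labels, keys) {
  hits <- sapply(keys, function(k) grepl(k, text, ignore.case = TRUE))
  hits <- matrix(hits, nrow = length(text))  # keep matrix shape for one record
  apply(hits, 1, function(row) paste(labels[row], collapse = " | "))
}

corpus <- tag_corpus(toy_text, labels, keys)
print(corpus)         # "IBM" "IBM | ABM" "ABM"
print(table(corpus))  # one record per category in this toy input
```

On a real result, table(db$Corpus) gives the same kind of per-corpus count for the data frame returned by create_bibliography.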