perform full text search on tweets collected with epitweetr
Performs a full-text search on tweets collected with epitweetr. Tweets migrated from epitweetr versions earlier than 1.0.x are also included.
search_tweets(
  query = NULL,
  topic = NULL,
  from = NULL,
  to = NULL,
  countries = NULL,
  mentioning = NULL,
  users = NULL,
  hide_users = FALSE,
  action = NULL,
  max = 100,
  by_relevance = FALSE
)
Arguments
query: character, the search query. Plain text is matched against the tweet text. To match particular fields, see details, default: NULL
topic: character, vector of topics to include in the search, default: NULL
from: an object which can be converted to "POSIXlt"; only tweets posted on or after this date will be included, default: NULL
to: an object which can be converted to "POSIXlt"; only tweets posted on or before this date will be included, default: NULL
countries: character or numeric, the names or positions of the epitweetr regions to include in the query, default: NULL
mentioning: character, limit the search to tweets mentioning the given users, default: NULL
users: character, limit the search to tweets created by the given users, default: NULL
hide_users: logical, whether to hide user names in the output by replacing them with the USER keyword, default: FALSE
action: character, an action to perform on the search results, respecting the max parameter. Possible values are 'delete' or 'anonymise', default: NULL
max: integer, maximum number of tweets to include in the search, default: 100
by_relevance: logical, whether to sort the results by relevance to the query (TRUE) or by indexing order (FALSE), default: FALSE
Returns
A data frame containing the tweets matching the selected filters. The data frame contains the following columns: linked_user_location, linked_user_name, linked_user_description, screen_name, created_date, is_geo_located, user_location_loc, is_retweet, text, text_loc, user_id, hash, user_description, linked_lang, linked_screen_name, user_location, totalCount, created_at, topic_tweet_id, topic, lang, user_name, linked_text, tweet_id, linked_text_loc, hashtags, user_description_loc
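As a rough illustration of working with the returned data frame, the sketch below filters out retweets and counts the remaining tweets per topic using base R. The data frame here is a mock stand-in with invented values and only a few of the listed columns; it also assumes that is_retweet is a logical column, which the text above does not state.

```r
# Mock stand-in for a search_tweets() result; all values are invented
# for illustration and only a subset of the documented columns is used.
df <- data.frame(
  topic = c("COVID-19", "COVID-19", "Ebola"),
  lang = c("en", "fr", "en"),
  is_retweet = c(FALSE, TRUE, FALSE),  # assumed to be logical
  text = c("first tweet", "second tweet", "third tweet"),
  stringsAsFactors = FALSE
)

# Keep original tweets only, then count them per topic.
originals <- df[!df$is_retweet, ]
counts <- table(originals$topic)
print(counts)
```

The same subsetting would apply to a real result, provided the column types match these assumptions.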
Details
In addition to the provided parameters, multi-field queries are supported using the syntax field_name:value1;value2. The operators AND, OR and - (for excluding terms) are supported in the query parameter. Ordering by week is always applied before relevance, so even if you provide by_relevance = TRUE, all matching tweets from the first week are returned first.
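A minimal sketch of how a multi-field query string could be assembled follows. The helper field_clause is hypothetical and not part of epitweetr. The field names text and lang are assumptions, taken from the column names of the returned data frame, and the spacing around AND is also assumed. The call to search_tweets is wrapped in if (FALSE) because it needs a configured epitweetr data directory.

```r
# Hypothetical helper (not part of epitweetr): builds one
# "field_name:value1;value2" clause from a field name and its values.
field_clause <- function(field, values) {
  paste0(field, ":", paste(values, collapse = ";"))
}

# Combine two clauses with AND. The field names are assumed to match
# the column names of the data frame returned by search_tweets().
query <- paste(
  field_clause("text", c("vaccine", "vaccination")),
  field_clause("lang", c("en", "fr")),
  sep = " AND "
)
print(query)

if (FALSE) {
  # Requires a configured epitweetr data directory.
  library(epitweetr)
  setup_config(file.choose())
  df <- search_tweets(query = query, max = 50, by_relevance = TRUE)
  head(df$text)
}
```

Building the string separately makes it easy to inspect the query before running the search.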
Examples
if(FALSE){
   # Searching tweets collected by epitweetr
   library(epitweetr)
   message('Please choose the epitweetr data directory')
   setup_config(file.choose())
   df <- search_tweets(
     query = "vaccination",
     topic = "COVID-19",
     countries = c("Chile", "Australia", "France"),
     from = Sys.Date(),
     to = Sys.Date()
   )
   df$text
}