Compute a Citations Network of functions from a corpus_list dataframe
add_doc_network_to_corpus.Rd
This function reads a standard list of data.frame (class corpus.list), selects the functions data.frame, and appends a Citations Network to the corpus (see below). The classes of this new entry are data.frame and citations.network.
Usage
add_doc_network_to_corpus(
  corpus,
  matches_colname = "name",
  content_colname = "code",
  prefix_for_2nd_matches = "(?<!\\w)",
  suffix_for_2nd_matches = "(\\b|\\()",
  filter_egolink_within_a_file = TRUE,
  exclude_quoted_content = FALSE
)
Arguments
- corpus
character. A corpus.list object from the construct_corpus function.
- matches_colname
character, default = 'name'. The name of the column of the functions df that will be used to construct a regex.
- content_colname
character, default = 'code'. The name of the column of the functions df that will be searched for matches and from which text is extracted.
- prefix_for_2nd_matches
character, default = "(?<!\\w)". A string representing a regex to add as a prefix to each 1st match when it is turned into a new regular expression.
- suffix_for_2nd_matches
character, default = "(\\b|\\()". A string representing a regex to add as a suffix to each match, in order to form a complete regular expression.
- filter_egolink_within_a_file
logical, default = TRUE. Whether to filter out "ego links" (a document referring to itself).
- exclude_quoted_content
logical, default = FALSE. Whether quoted content should be excluded. If set to TRUE, text within " or ' on the same line is removed before the matching is performed.
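As an illustration of what exclude_quoted_content = TRUE is described as doing, here is a minimal base R sketch that strips text enclosed in " or ' on each line before any matching. The input lines and the regular expression are assumptions made for this example; the package may implement the step differently.

```r
# Hypothetical lines of code, one element per line
lines <- c('x <- foo("bar(1)")', "y <- baz('qux()') + foo(2)")

# Remove anything between a pair of " or a pair of ' within each element
# (assumed pattern, not taken from the package source)
unquoted <- gsub("\"[^\"]*\"|'[^']*'", "", lines)

print(unquoted)
```

With the quoted parts removed, a name such as bar or qux that only appears inside a string can no longer be counted as a function call.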
Value
A list of data.frame, with a supplementary df that is the edgelist of a document-to-document citations network:
- from
character. Citations Network - The local file path or GitHub URL that calls a function.
- to
character. Citations Network - The local file path or constructed GitHub URL where the called function is defined.
- file_path
character. Corpus - The local file path or constructed GitHub URL.
- file_path_from
character. The file_path of the 'from' function.
- function_order
character. The order of citation within a text entry.
- n_fn_call
integer. The number of calls to a function defined in another file.
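To make the shape of the edgelist more concrete, the sketch below builds a toy data.frame with three of the columns listed above and drops ego links, which is what filter_egolink_within_a_file = TRUE is described as doing. The file paths and counts are invented for illustration, and the filtering line is an assumption about the behaviour, not the package's code.

```r
# Toy edgelist with hypothetical file paths and call counts
edges <- data.frame(
  from      = c("R/a.R", "R/a.R", "R/b.R"),
  to        = c("R/b.R", "R/a.R", "R/c.R"),
  n_fn_call = c(2L, 1L, 3L),
  stringsAsFactors = FALSE
)

# Keep only links between two different files (drop ego links)
no_ego <- edges[edges$from != edges$to, ]

print(no_ego)
```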
Details
It is designed to generate a network of texts by cascading text searches, assuming the 1st matches have already been made by construct_corpus:
The function crafts a pattern by combining all the 1st matches (the elements of the matches column), adding a prefix to these elements and possibly a suffix (depending on the number of characters).
It then performs a direct extraction with this pattern and returns the corpus with a new data.frame of class citations.network (document with the 2nd match => document with the 1st match).
By default, ego links are removed, since filter_egolink_within_a_file defaults to TRUE.
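The pattern-crafting step can be sketched in base R as follows. This is a rough illustration using the default prefix and suffix from the Usage section; the function names and the code string are hypothetical, and the way the pieces are pasted together is an assumption, not necessarily how the package builds its pattern.

```r
# Hypothetical 1st matches, standing in for the 'name' column
fn_names <- c("read_data", "clean_data", "plot")

prefix <- "(?<!\\w)"    # default prefix_for_2nd_matches
suffix <- "(\\b|\\()"   # default suffix_for_2nd_matches

# Wrap each name with the prefix and suffix, then join with "|"
pattern <- paste(paste0(prefix, fn_names, suffix), collapse = "|")

# Hypothetical content, standing in for one entry of the 'code' column
code <- "x <- read_data(path); y <- clean_data(x); myplot(y)"

# perl = TRUE is needed because the prefix uses a look-behind
hits <- regmatches(code, gregexpr(pattern, code, perl = TRUE))[[1]]

print(hits)
```

The look-behind prefix keeps plot from matching inside myplot, since the preceding character is a word character. Names containing regex metacharacters (for example a dot) are not escaped in this sketch.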
Examples
{
# Example with a local folder path
corpus <- construct_corpus(folders = "~", languages = "R")
corpus <- add_doc_network_to_corpus(corpus)
# Returns a list of df (the 1st one is supposed to be an edgelist)
# (from the file where a function is called => to the file where it is defined)
}
#> Error in sub(re, "", x, perl = TRUE): input string 8 is invalid UTF-8