Turn a programming project into a corpus and get messages about files and functions
explor_project.Rd
This function builds a corpus using the default settings. It returns the corpus together with a citation network of internal dependencies, and prints some messages.
Arguments
- folders
  character. A string or list giving the path(s) of the local folder(s) to read.
- repos
  character. A string or list giving the name(s) of GitHub repos (e.g., 'tidyverse/stringr').
- languages
  character. Default = "R". A character vector specifying the programming language(s) to include in the corpus.
- head
  integer. Default = 5. Number of lines to print.
- ...
  Parameters passed to construct_corpus(). These parameters are:
  - character values, used to add a prefix or a suffix to the searched pattern (e.g., suffix_for_2nd_matches) or to change column names (e.g., file_path_from_colname);
  - double values, e.g., n_char_to_add_suffix (the minimum number of characters required to add the suffix);
  - logical values, e.g., filter_egolink_within_a_file (default = TRUE) and exclude_quoted_content, which excludes quoted content from the search (default = FALSE).
Value
A list of 5 data frames: 2 of class corpus.lines, 2 of class corpus.nodelist, and 1 of class citations.network (the edgelist of a document-to-document citation network within a programming project).
- from
  character. citations.network - The local file path or GitHub URL of the file that calls a function.
- to
  character. citations.network - The local file path or constructed GitHub URL of the file where the called function is defined.
- function
  character. citations.network - The name of the function matched on a line.
- content_matched
  character. citations.network - The full content matched by the 2nd matches, useful for checking the result and crafting a new regex.
- line_number
  character. citations.network & corpus.lines - The line number of the 2nd match (citations.network) or the line number associated with a line (corpus.lines).
- file_path
  character. corpus.lines & corpus.nodelist - The local file path or constructed GitHub URL; same values as the from & to columns of the citations.network data frame.
- content
  character. corpus.lines - The content of a line.
- matches
  character. corpus.lines (specifically the codes data frame) - The text matched during the 1st matches (all NA if there is no match or if the matches are filtered out, which is the default).
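
To make the shape of the citations.network edgelist more concrete, here is a minimal base R sketch built from toy values. It does not call the package: the file paths, function names and matched contents are invented for illustration, and the objects are plain data frames rather than the classed objects described above. The column named function needs backticks and check.names = FALSE, because function is a reserved word in R.

```r
# Toy edgelist mimicking the documented citations.network columns.
# Every value below is made up for illustration.
citations_network <- data.frame(
  from            = c("R/main.R", "R/main.R", "R/utils.R"),
  to              = c("R/utils.R", "R/plot.R", "R/plot.R"),
  `function`      = c("clean_data", "draw_plot", "draw_plot"),
  content_matched = c("clean_data(", "draw_plot(", "draw_plot("),
  line_number     = c("3", "10", "7"),  # documented as character
  check.names      = FALSE,             # keep the reserved word as a column name
  stringsAsFactors = FALSE
)

# Toy nodelist: one row per file, with file_path sharing its values
# with the from and to columns of the edgelist.
nodelist <- data.frame(
  file_path = unique(c(citations_network$from, citations_network$to)),
  stringsAsFactors = FALSE
)

# Every edge endpoint should be found in the nodelist.
all_known <- all(c(citations_network$from, citations_network$to) %in% nodelist$file_path)

# Number of incoming calls per file (in-degree of each document).
in_degree <- table(citations_network$to)

print(citations_network)
print(in_degree)
```

Counting the rows per to value, as done with table() here, is one simple way to see which files define the most frequently called functions.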