Citations Network of Internal Dependencies computed by codexplor
vignette_citations.network_df_of_internal.dependencies.Rmd
This Vignette explain the data.frame
with additional
classes citations.network
and
internal.dependencies
. This type of data.frame
is the edgelist of a Citations Network of functions or files within
programming project(s).
Overall Considerations : about the citations.network
data.frame
codexplor::construct_corpus
protocols add a
citations.network
class of data.frame
to the
main corpus.list
.
This citations.network
data.frame
is an
edgelist1
that symbolize the programming project as a document-to-document
citations network : a network of documents, citing each others.
codexplor
offers several levels-of-depth, e.g., the
nodes of a citations.network
data.frame
could
be the functions or the programming files of the corpus.
A citations network have theoretical network properties, not covered here (e.g., a citations network is a directed network).
Network properties
Hereafter, some of the properties of 2 citations.network
df of the corpus.list
: the files.network
and
functions.network
df.
Name | files.network |
functions.network |
Nodes | Programming files | Functions |
Links | Start from a programming file that calls a function and point to the programming file where the function is defined | Start from a function that call another function and point to this 2nd function |
Indegrees | °← How many files of the project(s) are calling the function(s) defined in this file | °← How many functions of the project(s) are calling this function |
Outdegrees | °→ Number of files of the project(s) where the functions used by this file are defined | °→ How many functions of the project(s) are used by this function |
Automentioning | (codexplor default is to filter out
autolinks within files) |
Recursivity |
See a dataviz’ of a
files.network
in the short example of documents network
Technical details
Cascading matching.
codexplor::construct_corpus
perform a cascading matching
:
- The names of the exposed functions names are extracted from files where these functions are defined, e.g., have found ‘Pier’, ‘pol’ and ‘jak’ function names in 3 separate files.
- The functions that calls these functions are identified, e.g., a
function is calling
'Pier('
,'pol('
and'jak('
and the function is linked to these 3 functions. - Similarly, a network of files is constructed, e.g., the code of a
programming file is calling
'Pier('
,'pol('
and'jak('
and this file is linked to the files where these functions are defined.
Function call that are non-supported during 2nd matches
Some way of writing a program are not fully supported during the 2nd matches.
Python instantiated methods. If we take this Python
code sample hereafter, the 1st match is realized by
construct_corpus
, but codexplor
can’t match
the instantiated Python methods, i.e. obj.method()
.
Filter undesired match
Junk nodes needs to be excluded from the returned networks, such as
the lines where you use the name of another function in a message,
e.g. message("my_func() is deprecated. You should use this_function() instead")
.
construct_corpus
offer the
exclude_quoted_content
parameter, in order to exclude
quoted text from the matches and avoid this false positive case.
construct_corpus("~", languages = "R", exclude_quoted_content = TRUE)
Be aware that all the quoted text will be avoided during the search
for functions names, if you’ve included some - fancy - way of
call a function as a quoted character
chain.