This function returns a list with one entry per specified programming language. Each entry holds information about that language, e.g., the associated file extension and the regex patterns for matching commented lines and function definitions.

Usage

get_def_regex_by_language(languages = NULL, ..., return_examples = F)

Arguments

languages

character, default = "R". A programming language to match (character string).

...

character One or more programming language names (character strings).

return_examples

logical If TRUE, return a list (thesaurus) that formalizes how a function is assigned in each of the languages supported by construct_corpus.

Value

A list where each entry corresponds to a language and contains:

language

character The name of the language on a row, e.g., 'R'.

fn_regex

character A regex that captures function names wherever a function is defined within a file.

file_extension

character The typical programming file extension, e.g., ".R" for the R language.

commented_line_char

character The pattern that marks a commented line, i.e., everything after that pattern is a comment.

delim_pair_comments_block

character list giving the opening and closing characters that delimit a multi-line comment, in addition to the one-line commented_line_char syntax.

delim_pair_nested_codes

character list giving the opening and closing characters that delimit a multi-line block of code.

pattern_to_exclude

character The pattern of typical programming files to exclude from the analyses, e.g., "\\.Rcheck|test-|vignettes" for the R language.

local_file_ext

character The typical programming file extension turned into a regex, by pasting "$" to the end of the file_extension value.
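As a minimal sketch of how the file-related entries could be combined, the snippet below filters a vector of paths using base R only. The values are copied from the R entry in the example output on this page; the `paths` vector is made up for illustration, and this is not necessarily how the package itself filters files.

```r
# Values copied from the R entry in the example output on this page
file_extension     <- ".R"
pattern_to_exclude <- "\\.Rcheck|test-|vignettes|/doc/"

# local_file_ext is described as file_extension with "$" pasted at the end
local_file_ext <- paste0(file_extension, "$")

# Made-up paths, for illustration only
paths <- c(
  "R/utils.R",
  "tests/testthat/test-utils.R",
  "vignettes/intro.R",
  "pkg.Rcheck/foo.R",
  "README.md"
)

# Keep paths that end with the extension and do not match the exclusion pattern
is_source   <- grepl(local_file_ext, paths)
is_excluded <- grepl(pattern_to_exclude, paths)
kept <- paths[is_source & !is_excluded]
print(kept)
```

Note that the "." in ".R$" is not escaped, so as a regex it matches any character before the final "R".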

Details

This function supports multiple languages in a single call. Language names are case-insensitive.
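The behaviour described above can be pictured with a small base R sketch. `normalize_languages()` and its `supported` argument are hypothetical names introduced here for illustration; they are not part of the package, and the actual implementation may differ.

```r
# Hypothetical helper, NOT part of the package: case-insensitive matching of
# several language names against an assumed vector of supported languages
normalize_languages <- function(..., supported = c("R", "Python")) {
  requested <- tolower(unlist(list(...)))
  idx <- match(requested, tolower(supported))

  # Warn about names that match no supported language
  unsupported <- unique(requested[is.na(idx)])
  if (length(unsupported) > 0) {
    warning("Unsupported language(s): ", paste(unsupported, collapse = ", "))
  }

  # Return the canonical spelling, in the order of `supported`
  supported[sort(unique(idx[!is.na(idx)]))]
}

matched <- normalize_languages("python", "r")
print(matched)
```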

Examples

fn_def <- get_def_regex_by_language("Python", "R" , "JavaScript")
#> Warning: Unsupported language(s) (javascript) !
#>  Available languages => R, Python
names(fn_def) ; str(fn_def[[1]])
#> [1] "R"      "Python"
#> List of 8
#>  $ fn_regex                   : chr "(^|\\.|\\b)(?!FUN|error)[a-zA-Z_\\.][a-zA-Z0-9_\\.]*\\s*(?=(?:<-|=)\\s*(?:function\\())"
#>  $ file_extension             : chr ".R"
#>  $ commented_line_char        : chr "#"
#>  $ delim_pair_comments_block  : logi NA
#>  $ pattern_to_exclude         : chr "\\.Rcheck|test-|vignettes|/doc/"
#>  $ escaping_char              : chr "\\"
#>  $ fn_regex_params_after_names: chr "\\s*(<-|=)\\s*function\\("
#>  $ delimited_fn_codes         : logi TRUE
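As a hedged follow-up, the fn_regex value printed above can be applied to a few lines of R code with base R. The pattern below is copied from that output; the `code_lines` vector is made up for illustration, and this sketch does not depend on the package.

```r
# Pattern copied from the fn_regex entry printed above for R
fn_regex <- "(^|\\.|\\b)(?!FUN|error)[a-zA-Z_\\.][a-zA-Z0-9_\\.]*\\s*(?=(?:<-|=)\\s*(?:function\\())"

# Made-up lines of R code, for illustration only
code_lines <- c(
  "add_one <- function(x) x + 1",
  "total = function(a, b) a + b",
  "y <- add_one(2)"
)

# perl = TRUE is needed because the pattern uses lookaheads;
# regmatches() drops the lines without a match
hits <- regmatches(code_lines, regexpr(fn_regex, code_lines, perl = TRUE))

# The match includes trailing whitespace before the assignment operator
fn_names <- trimws(hits)
print(fn_names)
```

Only the two lines that assign a function produce a match; the call `add_one(2)` is not reported as a definition.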