Match a text pattern in a df column and extract only the matched text
srch_pattern_in_df.Rd
Takes the content of files already read into a df, tries to extract a pattern from it, and returns the extracted text in a column of the returned df (NA meaning 'no match').
Usage
srch_pattern_in_df(
df,
content_col_name = "content",
pattern = "(^| \\.|\\b)([\\.A-Za-z0-9_]+)(?=\\s*(?:<-)\\s*function)",
match_to_exclude = NULL,
ignore_match_less_than_nchar = 3,
extracted_txt_col_name = "matches",
duplicated_lines_are_normal = F
)
Arguments
- df
data.frame. A data.frame with at least one character column.
- content_col_name
character, default = "content". Name of the text column in the input df (it is also returned in the output df).
- pattern
character, default = "(^| \\.|\\b)([\\.A-Za-z0-9_]+)(?=\\s*(?:<-)\\s*function)". A regex used to match lines and extract text.
- match_to_exclude
character, default = NULL. A vector of values that must not be returned as matches. Rows where the values match any element of this vector are removed.
- ignore_match_less_than_nchar
double, default = 3. Excludes matches whose number of characters is strictly below this value. The default excludes short matches such as 'x'.
- extracted_txt_col_name
character, default = "matches". Column name for the extracted text (last column of the returned df).
- duplicated_lines_are_normal
logical, default = FALSE. If set to TRUE, silences the warning about duplicated lines.
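To see what the default pattern captures, here is a minimal base R sketch. It does not call srch_pattern_in_df and is not the package's implementation. The sample lines are hypothetical, and the use of regexpr() with perl = TRUE (needed for the lookahead) is an assumption about how such a pattern can be applied.

```r
# Default pattern copied from the Usage section.
pattern <- "(^| \\.|\\b)([\\.A-Za-z0-9_]+)(?=\\s*(?:<-)\\s*function)"

# Hypothetical sample lines, not taken from the package.
lines <- c(
  "my_fun <- function(x) x + 1",
  ".hidden <- function() NULL",
  "x <- 1"
)

# The lookahead (?=...) requires the PCRE engine, hence perl = TRUE.
m <- regexpr(pattern, lines, perl = TRUE)

# Start with NA everywhere, then fill in only the lines that matched.
extracted <- rep(NA_character_, length(lines))
extracted[m != -1] <- regmatches(lines, m)
extracted
```

Only the name on the left of `<- function` is extracted. A plain assignment such as `x <- 1` yields NA.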
Value
A data.frame similar to the one passed by the user, with one more column holding the match (named by extracted_txt_col_name). At a minimum:
- content
character. The text column designated by the user.
- match
character. The matched text on this line, NA if there is no match.
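The shape of the returned value can be approximated in base R as follows. This is a rough sketch under stated assumptions, not the function's actual code. The input rows are hypothetical, the extra column uses the default name "matches", and keeping the NA rows while dropping the rows listed in match_to_exclude is an assumed reading of the argument description.

```r
# Hypothetical input; the docs only require a character column.
df <- data.frame(
  content = c("keep_me <- function() 1", "skip_me <- function() 2", "y <- 3"),
  stringsAsFactors = FALSE
)

# Default pattern copied from the Usage section.
pattern <- "(^| \\.|\\b)([\\.A-Za-z0-9_]+)(?=\\s*(?:<-)\\s*function)"
m <- regexpr(pattern, df$content, perl = TRUE)

# Add the extracted text as the last column, NA meaning 'no match'.
df$matches <- NA_character_
df$matches[m != -1] <- regmatches(df$content, m)

# Assumption: rows whose extracted value is excluded are dropped, NA rows are kept.
match_to_exclude <- c("skip_me")
res <- df[!(df$matches %in% match_to_exclude), , drop = FALSE]
res
```

The result keeps the original text column and appends the extracted text as the last column, with NA on lines that did not match.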