Extract regex matches from a string and return a tidy dataframe
str_extract_all_to_tidy_df.Rd
This function applies stringr::str_extract_all()
to a character vector, extracts the regex matches,
and returns an unnested two-column data frame:
1st column is the matched text. Input elements without a match are filtered out.
2nd column is the corresponding position index.
Option offered: customizable column names.
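For context, stringr::str_extract_all() on its own returns a list with one element per input string, which is the nested shape this function flattens. A small illustration follows; the input vector is made up for the example.

```r
library(stringr)

# Made-up input: the second element contains no digits
text <- c("a1 b22", "no digits here", "c333")

# str_extract_all() returns a list with one character vector per input
# element; elements without a match yield character(0)
nested <- str_extract_all(text, "[0-9]+")

# Number of matches found in each input element
lengths(nested)
```

Here the second element contributes zero matches, which is why it has no row in the unnested result.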
Usage
str_extract_all_to_tidy_df(
string,
pattern,
matches_colname = "matches",
row_number_colname = "row_number"
)
Arguments
- string
character vector. A character vector containing the input text.
- pattern
character. A regex pattern to extract matches.
- matches_colname
character. A string specifying the column name for extracted matches (default: "matches").
- row_number_colname
character. A string specifying the column name for row numbers (default: "row_number").
Value
A data frame with the extracted matches and their corresponding row numbers.
- matches
1st column: the matched text. The column name is set by the matches_colname parameter (default: "matches").
- row_number
2nd column: the position of the match within the input vector, i.e. the index of the element in which it was found. The column name is set by the row_number_colname parameter (default: "row_number").
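The behaviour described above can be approximated with a minimal sketch. This is not the package's actual source: sketch_extract_to_tidy_df() is a hypothetical stand-in written with stringr and base R only, and it assumes that row_number is the index of the input element in which each match was found.

```r
library(stringr)

# Hypothetical re-implementation, for illustration only
sketch_extract_to_tidy_df <- function(string,
                                      pattern,
                                      matches_colname = "matches",
                                      row_number_colname = "row_number") {
  # One list element per input element; character(0) when nothing matches
  found <- stringr::str_extract_all(string, pattern)

  out <- data.frame(
    # Flatten the list of matches into a single character vector
    matches = as.character(unlist(found)),
    # Repeat each input index once per match found in that element, so
    # elements without a match contribute no rows
    row_number = rep(seq_along(found), lengths(found)),
    stringsAsFactors = FALSE
  )

  # Apply the user-supplied column names
  names(out) <- c(matches_colname, row_number_colname)
  out
}

# Made-up input: the second element contains no digits
text <- c("a1 b22", "no digits here", "c333")

res <- sketch_extract_to_tidy_df(text, "[0-9]+")

# Same call with custom column names
res_custom <- sketch_extract_to_tidy_df(
  text, "[0-9]+",
  matches_colname = "number",
  row_number_colname = "line"
)
```

With this input, res has three rows: "1" and "22" with row number 1, and "333" with row number 3. The second element has no match and therefore no row.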