add support for the regex-split crate to add split_inclusive functionality #2

Open
kpdowney opened this issue Jul 4, 2024 · 3 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@kpdowney

kpdowney commented Jul 4, 2024

In Python's re.split, if the pattern contains a capture group, the result includes the matched delimiters in addition to the delimited tokens. The default Rust regex crate does not do this. I believe this behavior is used pretty extensively in Python, and it would be a great addition IMO.
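A minimal sketch of the behavior described above, using only the standard re module. The sample string and the variable names are made up for illustration.

```python
import re

text = "a,b;c"

# Without a capture group, the delimiters are dropped from the result.
without_group = re.split(r"[,;]", text)

# With a capture group, each matched delimiter is kept in the result.
with_group = re.split(r"([,;])", text)

print(without_group)  # ['a', 'b', 'c']
print(with_group)     # ['a', ',', 'b', ';', 'c']
```

The only difference between the two calls is the pair of parentheses around the pattern.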

@itsmeadarsh2008 itsmeadarsh2008 added enhancement New feature or request good first issue Good for newcomers labels Jul 5, 2024
@itsmeadarsh2008
Owner

itsmeadarsh2008 commented Jul 5, 2024

@kpdowney
Can you give me an example of how you would do that with the native re module in Python (no flpc)? Please include a test regex and valid strings to test against, so I can check my code against the output.
How exactly do you want the code to be structured? I want a consistent naming system. The underscore in split_inclusive defeats the point of being an analogous library.

@KevinPD66

Good morning, and sorry for the delay. Here is a super simple example with the standard re module. I split on a space, a period, and an exclamation mark. With a capture group, it brings back both the words and the delimiters I split on. This is the functionality I was suggesting for consideration. The naming I mentioned is not important; I would be very happy with whatever you feel is best. Thanks for your help.

import re

text = "The fox jumps over the dog. Poor dog!"
# The period is escaped so it matches a literal '.' rather than any character.
re.split(r'(\s|\.|!)', text)

Output:
['The',
' ',
'fox',
' ',
'jumps',
' ',
'over',
' ',
'the',
' ',
'dog',
'.',
'',
' ',
'Poor',
' ',
'dog',
'!',
'']

@itsmeadarsh2008
Owner

I'm unsure about the issue, but adding a separate dependency would make the library unnecessarily bloated.
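For what it's worth, the capture-group behavior can be approximated without any extra dependency by walking the matches and slicing the text between them. The sketch below does this in plain Python; split_keeping_delimiters is a hypothetical helper name, not part of re or flpc, and it assumes the pattern never matches the empty string. The same loop should be expressible with any regex API that reports match start and end offsets.

```python
import re

def split_keeping_delimiters(pattern, text):
    """Hypothetical helper: split text on pattern, keeping each matched delimiter."""
    pieces = []
    last_end = 0
    for match in re.finditer(pattern, text):
        # Text between the previous match and this one (may be empty).
        pieces.append(text[last_end:match.start()])
        # The delimiter itself.
        pieces.append(match.group(0))
        last_end = match.end()
    # Whatever remains after the final match (may be empty).
    pieces.append(text[last_end:])
    return pieces

text = "The fox jumps over the dog. Poor dog!"
result = split_keeping_delimiters(r"\s|\.|!", text)

# For this input, the result agrees with re.split using a capture group.
assert result == re.split(r"(\s|\.|!)", text)
```

This only covers the case where the whole pattern is the delimiter; patterns with several capture groups, or with empty matches, would need more care.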


3 participants