[OFF-TOPIC] - Regex - strip leading non-numeric chars

Does anyone know the pure witchcraft regex expression to remove all non-numeric characters from the beginning of a string?

For example, in each case I want to be left with 1234567 :
*&%&%^*hjghgfgyh u gyu1234567
+1234567
+001234567
hhrrttdghdhjfhfgrjdgfdhjfdkfjfihukjhr nuhh frelheruhehurhrxffr1234567

This is not for Python specifically (it’s actually for Lua) but I’m pretty sure the regex expression would be the same regardless.

I just cannot find the right words to google this.

In my defence, I am just getting over Covid.

I have this lovely cheatsheet bookmarked because I can never remember regex expressions: Regular Expressions Cheat Sheet by DaveChild (Cheatography)

I also usually use a regex checker to test what I’m trying to do: https://regex101.com/ (but maybe that’s obvious).

To be clear, in your case, are you only trying to remove the non-numeric characters at the beginning? So if you had hello1234567world, do you want to be left with 1234567world?

(I have sympathy for you, I just tested positive this morning :upside_down_face: )

Correct.

And I have sympathy for you. I hope you recover faster than I did!

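Could you then search for one or more digits followed by any characters, zero or more times? re.search(r'\d+.*', string) (Note that re.match only matches at the start of the string, so it would find nothing when the string begins with non-digits.)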
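A quick way to check this idea, as a rough sketch: re.search scans forward to the first digit, whereas re.match only tries position 0 and returns None when the string starts with junk. The helper name from_first_digit is made up for illustration.

```python
import re

def from_first_digit(text):
    """Return the text from the first digit onward, or '' if there are no digits."""
    # re.search scans the whole string; re.match would only try position 0.
    found = re.search(r'\d+.*', text)
    return found.group(0) if found else ''

print(from_first_digit('hello1234567world'))     # 1234567world
print(re.match(r'\d+.*', 'hello1234567world'))   # None
```

The explicit None check matters because calling .group() on a failed search raises AttributeError.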

Regex is easy to learn. I’ve learned it at least a hundred times.

On the server side:
re.sub(r'^[^\d]*', '', txt)

  • [abc] matches any character inside, either a, b or c
  • [^abc] matches any character that is not inside, all but a, b or c
  • \d matches any numerical character
  • [^\d] matches any non-numerical character
  • [^\d]* matches a run of zero or more non-numerical characters
  • ^[^\d]* matches zero or more non-numerical characters starting from the beginning of the string (yes, the ^ has one meaning if it’s the first character in the pattern, and another meaning if it’s the first character inside [])
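The pattern built up above can be tried against inputs like the ones in the question. This is a minimal sketch; the function name strip_leading_non_digits and the sample list are illustrative only.

```python
import re

def strip_leading_non_digits(text):
    """Remove every character before the first digit."""
    # ^ anchors at the start; [^\d]* consumes zero or more non-digit characters.
    return re.sub(r'^[^\d]*', '', text)

samples = [
    '*&%&%^*hjghgfgyh u gyu1234567',
    '+1234567',
    '+001234567',
    'hello1234567world',
]
for sample in samples:
    print(repr(sample), '->', repr(strip_leading_non_digits(sample)))
```

Note that '+001234567' comes out as '001234567': the pattern removes only non-digits, so leading zeros are kept.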

On the client side re.sub does not exist, so you can use re_sub, which behaves slightly differently from the real re.sub:
re_sub(r'^[^\d]*(.*)', r'\1', txt)
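The capture-group form can be sanity-checked with the standard re.sub, since re_sub is not part of standard Python. Whether re_sub treats the pattern identically is an assumption; this sketch only shows how the group-based replacement works.

```python
import re

txt = 'hhrr nuhh 1234567'
# [^\d]* swallows the leading non-digits; (.*) captures the rest as group 1,
# and the replacement r'\1' keeps only that captured part.
result = re.sub(r'^[^\d]*(.*)', r'\1', txt)
print(result)  # 1234567
```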

I’ll try that. This also works in regex fiddles to match the leading non-digits:

^\D*

but not in my code. I must be doing something wrong.
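In Python's re module, \D is shorthand for [^\d], so the two patterns should behave the same there, as the sketch below checks. If the target is Lua's built-in string library, note that it uses its own pattern syntax with % escapes (for example %D), which may be why \D fails in that code.

```python
import re

text = '+001234567'
# \D is the negated digit class, equivalent to [^\d].
short_form = re.sub(r'^\D*', '', text)
long_form = re.sub(r'^[^\d]*', '', text)
print(short_form, long_form)  # 001234567 001234567
```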

Thank you all, I will try all the examples on here.