python - RE to handle both formats

I have two types of files.

One contains a line like this:

"55.28 longurl0.20s: preplan async"

The other contains a line like this:

>55.28 longurl0.20s: preplan async</a></span><br></td>

In both cases, I'd like the content starting at longurl and ending at </a> or at the end of the line.

>>> import re
>>> b = "55.28 longurl0.20s: preplan async"
>>> a = ">55.28 longurl0.20s: preplan async</a></span><br></td>"
>>> re.findall(r'longurl\d*.\d*s:[^<]+', a)
['longurl0.20s: preplan async']
>>> re.findall(r'longurl\d*.\d*.*$', b)
['longurl0.20s: preplan async']

Can a single regex cover both?

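Why not use longurl\d+[^<]+? The [^<]+ part stops at the first < or at the end of the string, whichever comes first, so it covers both cases: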

>>> import re
>>> a = ">55.28 longurl0.20s: preplan async</a></span><br></td>"
>>> b = "55.28 longurl0.20s: preplan async"
>>> re.findall(r'longurl\d+[^<]+', a)
['longurl0.20s: preplan async']
>>> re.findall(r'longurl\d+[^<]+', b)
['longurl0.20s: preplan async']
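One caveat: [^<]+ also matches newlines, so if you run the pattern over a whole file's contents instead of one line at a time, a match in the plain file could run on into the following lines. Here is a minimal sketch of a slightly stricter variant, assuming the timing always looks like digits, a dot, digits, then "s:". The names LONGURL_RE and extract_longurl are made up for illustration and are not part of the original answer.

```python
import re

# Hypothetical stricter variant of the pattern above (an assumption, not the
# original answer): \d+\.\d+s: matches a timing such as "0.20s:", and
# [^<\n]+ stops at the first "<" or at the end of the line.
LONGURL_RE = re.compile(r'longurl\d+\.\d+s:[^<\n]+')


def extract_longurl(line):
    """Return the first longurl fragment found in line, or None."""
    match = LONGURL_RE.search(line)
    return match.group(0) if match else None


plain = "55.28 longurl0.20s: preplan async"
html = ">55.28 longurl0.20s: preplan async</a></span><br></td>"

print(extract_longurl(plain))
print(extract_longurl(html))
```

Both calls should give the same fragment. If the timing format can vary, the looser longurl\d+[^<]+ from the answer is the safer choice.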
