regex - Get values from onclick attribute using Python bs4


I am unable to parse the onclick attribute to get selected values. Here is the onclick attribute:

onclick="try{appendpropertyposition(this,'b10331465','9941951739','','dealer','murugan.n');jsb9onunloadtracking();jsevt.stopbubble(event);}catch(e){};" 

How can I get selected values from the onclick attribute, such as (phone number, '', 'dealer', 'name')? Here is my code:

    from bs4 import BeautifulSoup
    import urllib2
    import re

    url = "http://www.99acres.com/property-in-velachery-chennai-south-ffid?"
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page.read())
    properties = soup.findAll('a', title=re.compile('bedroom'))
    for eachproperty in properties:
        print "http:/" + eachproperty['href'] + ",", eachproperty.string, eachproperty['onclick']

Update

I want only one phone number, even though there are many, from the onclick attribute mentioned above.

For example, right now I am getting:

    y10765227, 9884877926, 9283183326,, dealer, rgmuthu
    l10038779, 9551154555, ,, ,
    r10831945, 9150000747, 9282109134, 9043728565, ,, ,
    b10750123, 9952946340, , dealer, bala
    r10763559, 9841280752, 9884797013, , dealer, senthil

I am getting this using the following code:

    re.findall("'([a-zA-Z0-9,\s]*)'", (a['onclick'] if a else ''))

I am trying to modify it so that only one phone number is retrieved and the rest vanish. It should look like this:

    y10765227, 9884877926, dealer, rgmuthu
    l10038779, 9551154555
    r10831945, 9150000747
    b10750123, 9952946340, dealer, bala
    r10763559, 9841280752, dealer, senthil

I am trying to use:

    re.findall("'([a-zA-Z0-9,\s]*)'", (re.sub(r'([^,]+,[^,]+,)(.*?)([a-zA-Z].*)', r'\1\0', a['onclick']) if a else ''))

But it does not seem to work.

You can use a regex to get the data out of onclick:

    properties = soup.findAll('a', title=re.compile('bedroom'))
    for eachproperty in properties:
        print re.findall("'([a-zA-Z0-9,\s]*)'", eachproperty['onclick'])

Prints:

    ['y10765227', '9884877926, 9283183326', '', 'dealer', 'rgmuthu']
    ['l10038779', '9551154555', ',', ',']
    ['r10831945', '9150000747, 9282109134, 9043728565', ',', ',']
    ['b10750123', '9952946340', '', 'dealer', 'bala']
    ['r10763559', '9841280752, 9884797013', '', 'dealer', 'senthil']
    ...
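To keep only the first phone number, one option is to post-process the list that `re.findall` returns instead of rewriting the string with `re.sub`. The following is a minimal sketch, not a tested drop-in for the live page. It assumes the first quoted value is always the property id and the second is the comma-separated phone list. The helper name `first_phone_only` and the shortened sample string are made up for illustration, and the sketch uses Python 3 `print` syntax, unlike the Python 2 code above.

```python
import re

# Same pattern as above: capture whatever sits between single quotes.
QUOTED = re.compile(r"'([a-zA-Z0-9,\s]*)'")

def first_phone_only(onclick):
    """Return the quoted onclick values, keeping a single phone number.

    Assumption: values[0] is the property id and values[1] is a
    comma-separated list of phone numbers.
    """
    values = QUOTED.findall(onclick or '')
    if len(values) > 1:
        # '9884877926, 9283183326' -> '9884877926'
        values[1] = values[1].split(',')[0]
    # Drop values that are empty or hold only commas and spaces.
    return [v.strip() for v in values if v.strip(' ,')]

# Hypothetical, shortened onclick value in the same shape as the question's.
sample = ("try{appendpropertyposition(this,'y10765227',"
          "'9884877926, 9283183326','','dealer','rgmuthu');}catch(e){};")
print(first_phone_only(sample))
# -> ['y10765227', '9884877926', 'dealer', 'rgmuthu']
```

Inside the loop this would be called as `first_phone_only(eachproperty['onclick'])`, and `', '.join(...)` on the result gives the comma-separated line shown in the question.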

Hope that helps.

