PHP - Parsing JavaScript on the server
I'm trying to create a basic web crawler that looks for links in adverts.
I have managed to find a script that uses cURL to get the contents of the target webpage:
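<?php
// Fetch the page and write the raw response body to a file.
$ch = curl_init("http://www.nbcnews.com");
$fp = fopen("source_code.txt", "w");

curl_setopt($ch, CURLOPT_FILE, $fp);   // stream the body into $fp
curl_setopt($ch, CURLOPT_HEADER, 0);   // leave response headers out

curl_exec($ch);
curl_close($ch);
fclose($fp);
?>

I also found one that uses the DOM: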
<?php
$html = file_get_contents('http://www.nbcnews.com');

$dom = new DOMDocument();
@$dom->loadHTML($html);   // @ suppresses warnings from malformed markup

// Grab all the links on the page
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");

for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');
    echo $url.'<br />';
}
?>

These are great, and I feel I'm heading in the right direction, except that quite a few adverts are displayed using JS. Because that runs client side, it isn't processed, so I see the JS code and not the ads.
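To make the problem concrete, here is a minimal sketch of what I mean. The helper name `extractLinks` and the sample markup (including the example.com URLs) are made up for illustration and are not taken from the real page. It wraps the DOM approach above in a function and runs it on a page where one link is static and another would only exist after the browser runs the script:

```php
<?php
// Hypothetical helper (not part of the scripts above): return the href of
// every <a> element found in a static HTML string.
function extractLinks(string $html): array
{
    $dom = new DOMDocument();
    // Suppress warnings from imperfect real-world markup.
    @$dom->loadHTML($html);

    $xpath = new DOMXPath($dom);
    $links = [];
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }
    return $links;
}

// Made-up sample page: one static link, plus one link that a browser would
// only create by running the script.
$sample = <<<'HTML'
<html><body>
<a href="http://example.com/static">Static link</a>
<script>document.write('<a href="http://example.com/ad">Ad<\/a>');</script>
</body></html>
HTML;

print_r(extractLinks($sample));
```

Only the static link should come back here. The parser treats the contents of the script element as plain text, so the advert link inside it never becomes an element that XPath can find.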
Basically, is there a way of getting the JS to execute before I start trying to extract the links?
Thanks