Bug in retrieving and modifying pages using XPath


I've written some code to modify web pages using XPath in PHP. The code below fetches an HTML page, retrieves part of it, and deletes part of the page.

It works for some pages, such as: http://chijoori.ir/excel-tutorial/

but fails for others, such as: http://delbaraneh.com/decorations/the-latest-interior-96/





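$dom = new DOMDocument();
$xpath = new DOMXPath($dom);

// Parse the HTML string in $v, converting UTF-8 characters to entities first
$dom2 = new DOMDocument();
@$dom2->loadHTML(mb_convert_encoding($v, 'HTML-ENTITIES', 'UTF-8'));
$xpath2 = new DOMXPath($dom2);

// Delete every node matched by $delete_xpath
$elements = $xpath2->query($delete_xpath);
foreach ($elements as $element) {
    $element->parentNode->removeChild($element);
}

echo $fullcontent;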
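
For reference, here is a minimal self-contained sketch of the delete step described above. The helper name `remove_by_xpath` and its return shape are hypothetical, not part of the original code. It collects libxml parser messages instead of silencing them with `@`, and it reports how many nodes the expression matched, which may help show whether a failing page fails at parsing or simply matches nothing. Encoding conversion is left out to keep the sketch short.

```php
<?php
// Hypothetical helper (not from the original post): remove every node that
// matches $deleteXpath and return the cleaned markup plus some diagnostics.
function remove_by_xpath(string $html, string $deleteXpath): array
{
    // Collect parser messages instead of suppressing them with @
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // Encoding handling is omitted here; the original converts to HTML-ENTITIES first
    $dom->loadHTML($html);
    $parseErrors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query($deleteXpath);
    if ($nodes === false) {
        throw new InvalidArgumentException("Invalid XPath expression: $deleteXpath");
    }

    // The result of query() is a static list, so removing nodes while iterating is safe
    $removed = 0;
    foreach ($nodes as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
            $removed++;
        }
    }

    // Serialize only the children of <body>, skipping the wrapper tags that
    // loadHTML() adds around a fragment
    $out = '';
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $dom->saveHTML($child);
        }
    }

    return [
        'html'         => $out,
        'removed'      => $removed,
        'parse_errors' => count($parseErrors),
    ];
}

// Example call with an inline fragment
$result = remove_by_xpath(
    '<div id="a"><p class="ad">x</p><p>keep</p></div>',
    '//p[@class="ad"]'
);
echo $result['html'], PHP_EOL;
```

If `removed` is 0 for a page, the XPath expression did not match anything in the parsed tree, which is a different problem from a failed download or a parse failure.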

function browser_test($url){

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; CrawlBot/1.0.0)');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    curl_setopt($ch, CURLOPT_ENCODING, "");
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);    # disables TLS certificate verification
    curl_setopt($ch, CURLOPT_MAXREDIRS, 15);
    $html = curl_exec($ch);
    $status = curl_getinfo($ch);
    curl_close($ch);
    // Treat a failed request or an empty body as a failure
    if ($html === false || $html === "") {
        return false;
    }
    return $html;
}
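
When the function works for one URL and not another, it can help to see what cURL itself reports. The sketch below is a hypothetical variant (the name `fetch_with_diagnostics` and the returned array keys are assumptions, not part of the original code) that returns the HTTP status, the final URL, and the cURL error message alongside the body. It also sets `CURLOPT_FOLLOWLOCATION`, on the assumption that redirects should be followed: in the original, `CURLOPT_MAXREDIRS` and `CURLOPT_AUTOREFERER` are set without it, so they have no effect. Whether any of this explains the failing page is not something the post establishes.

```php
<?php
// Hypothetical variant of browser_test() that reports why a fetch failed.
function fetch_with_diagnostics(string $url): array
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; CrawlBot/1.0.0)',
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 5,
        CURLOPT_TIMEOUT        => 5,
        CURLOPT_ENCODING       => '',     // accept any encoding cURL supports
        CURLOPT_FOLLOWLOCATION => true,   // assumption: redirects should be followed
        CURLOPT_MAXREDIRS      => 15,
    ]);

    $html  = curl_exec($ch);
    $info  = curl_getinfo($ch);
    $error = curl_error($ch);
    // The handle is released when $ch goes out of scope

    return [
        'html'      => ($html === false || $html === '') ? null : $html,
        'http_code' => $info['http_code'],
        'final_url' => $info['url'],
        'error'     => $error,
    ];
}

// Usage (performs a network request, so it is left commented out):
// $r = fetch_with_diagnostics('http://delbaraneh.com/decorations/the-latest-interior-96/');
// var_dump($r['http_code'], $r['final_url'], $r['error']);
```

A non-empty `error` (for example a timeout) or a 3xx/4xx `http_code` would point to the download step rather than the XPath step.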

Tags: dom, php, xpath | 2017-01-07 19:01 | 0 Answers
