Tuesday 15 July 2014

Cannot show the downloaded webpage with proper encoding using PHP


I have to fetch the content of a Persian page and show part of that page to some users. The problem is that after filtering the content of the page, I cannot display it with the proper encoding. The webpage is located on sena.ir, and here is a screenshot of the original webpage that I want to show:

And here is what I have tried:

    // $header defaults to array() rather than "" so that $header[] works on PHP 7.1+
    function getPage($url, $referer = "", $timeout = null, $header = array()) {
        if (!isset($timeout)) $timeout = 30;
        $curl = curl_init();
        if (strstr($referer, "://")) {
            curl_setopt($curl, CURLOPT_REFERER, $referer);
        }
        $header[] = 'Accept: image/gif, image/x-bitmap, image/jpeg, image/pjpeg';
        $header[] = 'Connection: Keep-Alive';
        $header[] = 'Content-type: application/x-www-form-urlencoded;charset=UTF-8'; // I have tried changing this as well, with no luck
        $user_agent = 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; Media Center PC 4.0)';
        $compression = "gzip";
        curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
        curl_setopt($curl, CURLOPT_HEADER, 0);
        curl_setopt($curl, CURLOPT_USERAGENT, $user_agent);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($curl, CURLOPT_POST, 0);
        curl_setopt($curl, CURLOPT_ENCODING, $compression);
        curl_setopt($curl, CURLOPT_TIMEOUT, 300);
        curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
        curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
        curl_setopt($curl, CURLOPT_URL, $url);
        $html = curl_exec($curl);
        curl_close($curl);
        return $html;
    }

    $content = getPage("http://sena.ir/");
    // Keep only the first table that matches this exact opening tag
    $p1 = strpos($content, '<table cellspacing="3" cellpadding="3" width="100%" border="0">');
    $p2 = strpos($content, "</table>", $p1);
    $content = substr($content, $p1, $p2 - $p1);
    echo $content;

The output problem arises because your proxy-like script strips the HTML head and the encoding declaration from the page, so you need to add these lines before echoing the filtered data:

    <html lang="fa">
    <head>
    <meta http-equiv="content-type" content="text/html; charset=UTF-8">
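As a rough illustration of that advice, the sketch below wraps an extracted fragment in a minimal document that declares UTF-8 and, when run under a web server, also sends a matching Content-Type header. The helper name wrapFragment and the sample fragment are hypothetical and not part of the original post. The sketch also assumes the fetched bytes are already UTF-8. If the source page used another encoding, the fragment would need converting first.

```php
<?php
// Hypothetical helper (not from the original post): wraps an extracted
// HTML fragment in a minimal document that declares UTF-8.
function wrapFragment(string $fragment, string $lang = 'fa'): string
{
    return '<!DOCTYPE html>' . "\n"
        . '<html lang="' . htmlspecialchars($lang, ENT_QUOTES, 'UTF-8') . '">' . "\n"
        . '<head><meta http-equiv="content-type" content="text/html; charset=UTF-8"></head>' . "\n"
        . '<body>' . $fragment . '</body>' . "\n"
        . '</html>';
}

// Sample fragment standing in for the filtered $content from the question.
// Assumption: the bytes are already UTF-8.
$fragment = '<table><tr><td>سلام</td></tr></table>';

// Under a web server, declare the charset at the HTTP level as well,
// before any output is sent.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$page = wrapFragment($fragment);
echo $page;
```

In the script from the question, the call would be wrapFragment($content) in place of the bare echo $content.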
