A function for fetching website content, with support for 301 and 302 redirects

9,981 views / 2010.01.18 / 4:04 PM

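When scraping website content we often run into odd hotlink protection schemes. Recently, for example, I hit a site whose image URLs were decoys: requesting one returned a 301 redirect to the real image path, and that real path was itself protected by a check that the Referer was the decoy URL. After a few attempts with cURL I ended up with the function below, which works well.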

$curl_loops = 0; // redirect counter; needed to avoid an infinite redirect loop
$curl_max_loops = 3;
 
function curl_get_file_contents($url, $referer='') {
global $curl_loops, $curl_max_loops;
$useragent = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)";
if ($curl_loops++ >= $curl_max_loops) {
  $curl_loops = 0;
  return false;
} 
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_REFERER, $referer);
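$data = curl_exec($ch);
if ($data === false) { // the request itself failed, so there is nothing to parse
  curl_close($ch);
  $curl_loops = 0;
  return false;
}
$ret = $data;
list($header, $data) = explode("\r\n\r\n", $data, 2);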
$http_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
$last_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
curl_close($ch);
if ($http_code == 301 || $http_code == 302) {
  $matches = array();
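  // header names are case-insensitive, and Location may be the last header line
  preg_match('/^Location:(.*)$/mi', $header, $matches);
  $url = $matches ? @parse_url(trim(array_pop($matches))) : false;
  if (!$url || !isset($url['scheme'], $url['host'])) {
    // no usable absolute Location header: stop and return the body as-is
    $curl_loops = 0;
    return $data;
  } 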
  $new_url = $url['scheme'] . '://' . $url['host'] . $url['path']
   . (isset($url['query']) ? '?' . $url['query'] : '');
  $new_url = stripslashes($new_url);
  return curl_get_file_contents($new_url, $last_url);
} else {
  $curl_loops = 0;
  list($header, $data) = explode("\r\n\r\n", $ret, 2);
  return $data;
} 
}
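
The function above rebuilds the next URL from the parsed Location header, which only works when that header holds an absolute URL. As a rough sketch of how the header could be extracted and resolved against the URL just requested, here is a small helper. The name resolve_redirect and the example.com URLs are assumptions made for illustration and are not part of the original function; the helper does not normalise ../ segments.

```php
<?php
// Hypothetical helper: find the Location header in a raw header block and
// turn it into an absolute URL relative to the URL that was just requested.
function resolve_redirect($header, $base_url) {
    // Header names are case-insensitive; /m lets ^ and $ match on every line.
    if (!preg_match('/^Location:(.*)$/mi', $header, $m)) {
        return false;
    }
    $location = trim($m[1]);
    $base = parse_url($base_url);
    if ($location === '' || !$base || !isset($base['scheme'], $base['host'])) {
        return false;
    }
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $location)) {
        return $location;                         // already absolute
    }
    if (substr($location, 0, 2) === '//') {
        return $base['scheme'] . ':' . $location; // scheme-relative
    }
    $origin = $base['scheme'] . '://' . $base['host']
        . (isset($base['port']) ? ':' . $base['port'] : '');
    if ($location[0] === '/') {
        return $origin . $location;               // host-relative
    }
    // Path-relative: replace everything after the last "/" of the base path.
    $path = isset($base['path']) ? $base['path'] : '/';
    return $origin . substr($path, 0, strrpos($path, '/') + 1) . $location;
}

$header = "HTTP/1.1 301 Moved Permanently\r\nServer: demo\r\nlocation: /real/a.jpg";
echo resolve_redirect($header, 'http://example.com/fake/a.jpg'), "\n";
```

The returned URL can then be passed as the next $url, with the previous URL as the referer, just as the recursive call does.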
Categories: Reflections

Notes on setting up a VPN service on Ubuntu Server 9.04

19,326 views / 2010.01.12 / 10:10 PM

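I just bought a server hosted abroad, and my first thought was that I could finally get past the firewall, so I set about building a VPN server. OpenVPN needs a separate client on Windows, so I ruled it out. L2TP/IPSec is too complicated, so I skipped that as well. That left PPTP, since both Windows and Mac ship with a built-in client for it. Below is my installation log.
Server environment: Ubuntu 9.04, single network interface
First, run sudo bash. It is not the safest way to work, but if you are careful it makes running the commands more convenient.
Start by installing the PPTP server: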
apt-get install pptpd
Once that succeeds, edit the configuration file:
vi /etc/pptpd.conf
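Uncomment the localip and remoteip parameters at the end of the file and edit them. Here localip is the server's IP address once the VPN connection is established, and remoteip is the range of IP addresses that can be assigned to clients. This is my configuration: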
# (Recommended)
localip 10.100.0.1
remoteip 10.100.0.2-10
# or
#localip 192.168.0.234-238,192.168.0.245
#remoteip 192.168.1.234-238,192.168.1.245
Next, edit /etc/ppp/pptpd-options to specify the DNS servers for the VPN. Here we use Google Public DNS:
vi /etc/ppp/pptpd-options
Set the ms-dns entries to:
ms-dns 8.8.8.8
ms-dns 8.8.4.4

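After saving, set up the user names and passwords by editing /etc/ppp/chap-secrets to suit your setup. The columns are as follows:
The first column is the user name, the second is the server name (pptpd by default; it must match the name used in pptpd-options), the third is the password, and the fourth is the IP restriction (use * for no restriction). For example, an entry for a made-up user alice with password s3cret and no IP restriction would read: alice pptpd s3cret *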

Finally, restart the pptpd service for the changes to take effect.

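So far we are only partly done: with this setup a client can reach only the server itself, and nothing else on the internal or external network. Let's continue.
Edit /etc/sysctl.conf and enable IPv4 forwarding by removing the comment marker in front of net.ipv4.ip_forward=1, then save and run sysctl -p.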

root@duyipeng:~# sysctl -p
net.ipv4.ip_forward = 1
With that, the VPN server is up and running.
If you still cannot reach the outside network, use iptables to set up NAT, as follows:
apt-get install iptables
iptables -t nat -A POSTROUTING -s 10.100.0.0/24 -o eth0 -j MASQUERADE
The 24 above is the subnet mask length: 24 one bits, i.e. 255.255.255.0.
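
To make the /24 arithmetic concrete, here is a small sketch that turns a prefix length into a dotted netmask and checks whether an address falls inside 10.100.0.0/24. The function names prefix_to_netmask and ip_in_subnet are made up for this illustration, and the code assumes a 64-bit PHP build.

```php
<?php
// Convert a prefix length such as 24 into a dotted-quad netmask.
// Assumes a 64-bit PHP build, where 0xFFFFFFFF is an ordinary integer.
function prefix_to_netmask($prefix) {
    $mask = (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF;
    return long2ip($mask);
}

// Check whether an IPv4 address falls inside network/prefix.
function ip_in_subnet($ip, $network, $prefix) {
    $mask = (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF;
    return (ip2long($ip) & $mask) === (ip2long($network) & $mask);
}

echo prefix_to_netmask(24), "\n";                        // 255.255.255.0
var_dump(ip_in_subnet('10.100.0.5', '10.100.0.0', 24));  // bool(true)
var_dump(ip_in_subnet('10.100.1.5', '10.100.0.0', 24));  // bool(false)
```

Every address in the remoteip range 10.100.0.2-10 matches the -s 10.100.0.0/24 selector of the MASQUERADE rule, which is why that one rule covers all VPN clients.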

To keep the iptables rules from being lost when the server reboots, first run

iptables-save > /etc/iptables-rules

Then edit /etc/network/interfaces and add the following under eth0:

pre-up iptables-restore < /etc/iptables-rules

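This way the rules are loaded automatically after the server reboots.

That should solve the problem. If it still does not work, check your routing and firewall. Good Luck!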

Categories: Reflections

Opening different URLs in the same window

12,088 views / 2009.12.31 / 2:02 PM

This is a flexible trick: if you want every link clicked on a page to load in one and the same separate window, you can do it like this.

<html>
<head>
 
<script type="text/javascript">
function focusWindow() {
var w = window.open("", "dyp"); // open, or reuse, the window named "dyp"
w.focus();
 }
</script>
</head>
 
<body>
<a href="b.html" target="dyp" onclick="focusWindow()">B</a>
<a href="c.html" target="dyp" onclick="focusWindow()">C</a>
</body>
</html>

This approach is commonly used to improve the user experience; it is quick and convenient.

Categories: Reflections

PHP cURL cookie storage and retrieval example

29,667 views / 2009.12.02 / 1:01 PM

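Many people have messaged me asking how to store and read cookie files with cURL. I don't think this is a hard problem, since the manual alone is enough to get the hang of it. Here is an example; after reading it everything should be clear: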

<?php
$cookie_jar_index = 'cookie.txt';
 
$url = "http://www.71j.cn/perl/login.pl";
$params = "username=dudu&password=****";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_jar_index);
//curl_setopt($ch, CURLOPT_COOKIE, "fruit=apple; colour=red");
// the line above passes cookie data directly instead of reading it from a file
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params); 
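//curl_setopt($ch, CURLOPT_NOBODY, 1); // do not enable this, or the cookie file will not be created
ob_start();
curl_exec($ch);
curl_close($ch);
ob_end_clean(); // discard the login response and close the output buffer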
 
$url = "http://www.71j.cn/perl/myfavorites.pl";
$ch2 = curl_init();
curl_setopt($ch2, CURLOPT_URL, $url);
curl_setopt($ch2, CURLOPT_COOKIEFILE, $cookie_jar_index);
ob_start();
curl_exec($ch2);
curl_close($ch2);
$rs = ob_get_contents(); // $rs holds the returned content
ob_end_clean(); // close the output buffer so nothing is echoed twice
 
print_r($rs);
 
?>
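
To see what ends up in cookie.txt, it can help to read the file back. CURLOPT_COOKIEJAR writes the Netscape cookie file format: one cookie per line, seven tab-separated fields, with # for comments. The sketch below parses such text into name => value pairs. The function name parse_cookie_jar and the sample cookies are assumptions made for illustration.

```php
<?php
// Parse the text of a Netscape-format cookie file, as written by
// CURLOPT_COOKIEJAR, into an array of name => value pairs.
function parse_cookie_jar($text) {
    $cookies = array();
    foreach (preg_split('/\r?\n/', $text) as $line) {
        // libcurl marks HttpOnly cookies with a "#HttpOnly_" prefix.
        if (strpos($line, '#HttpOnly_') === 0) {
            $line = substr($line, strlen('#HttpOnly_'));
        } elseif ($line === '' || $line[0] === '#') {
            continue; // blank line or comment
        }
        // Fields: domain, subdomain flag, path, secure, expiry, name, value
        $fields = explode("\t", $line);
        if (count($fields) === 7) {
            $cookies[$fields[5]] = $fields[6];
        }
    }
    return $cookies;
}

// Made-up sample content standing in for a real cookie.txt.
$sample = "# Netscape HTTP Cookie File\n"
        . "www.example.com\tFALSE\t/\tFALSE\t0\tfruit\tapple\n"
        . "#HttpOnly_www.example.com\tFALSE\t/\tFALSE\t0\tsid\tabc123\n";
print_r(parse_cookie_jar($sample));
```

With a real file, pass file_get_contents($cookie_jar_index) after the first handle has been closed, since the jar is written when the handle is closed.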
Categories: Sharing

Integrating an external site's user passport with Discuz 7.2: a detailed walkthrough

21,191 views / 2009.11.22 / 1:01 AM

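When running a forum on Discuz, we often need to unify accounts across the whole site, so that users of other product lines connect seamlessly with the forum. The steps below show how to do it.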

Step 1
Edit /register.php and add the following at the very top:

require_once './include/common.inc.php';
 // 通行证注册url below is a placeholder for your own passport registration host
 header("location:http://passport.通行证注册url/register.php?forward=" . $boardurl);
 exit;
 .....

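The goal is to block Discuz's own registration entry point and redirect users to the unified passport registration page.

Also remember to modify the showWindow function in /include/js/common.js: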

function showWindow(k, url, mode, cache) {
 if(k == 'register'){
 location.href='/register.php';
 return false;
 }
 ....

Now every registration entry point redirects to the passport registration page.

Step 2
At the end of include/common.inc.php, add the check below, assuming the unified passport's user cookie is $_COOKIE["UserInfo"]:

if(!$discuz_uid){
if($_COOKIE["UserInfo"]){
parse_str($_COOKIE["UserInfo"],$cookie_info);  // parse the user info for dologin.php to handle
header("location:http://".$_SERVER["HTTP_HOST"]."/dologin.php");
}
}
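
As a quick illustration of what parse_str does with such a cookie, here is a minimal sketch. The field names uid, username and email are assumptions; the real layout of UserInfo depends on the passport system in use.

```php
<?php
// Hypothetical cookie payload: the real UserInfo layout depends on the passport system.
$raw = 'uid=42&username=dudu&email=dudu%40example.com';

// parse_str() decodes a query-string style value into an array.
parse_str($raw, $cookie_info);

echo $cookie_info['username'], "\n"; // dudu
echo $cookie_info['email'], "\n";    // dudu@example.com
```

Because cookies are supplied by the client, a value like this should only be trusted after whatever verification the passport system provides.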

Step 3
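This is the key part. Create dologin.php in the forum root directory; its contents are shown below, with comments explaining what each part does: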

<?php
 
require_once './include/common.inc.php';
require_once DISCUZ_ROOT . './uc_client/client.php';
 
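// Cookie check
if ($_COOKIE["UserInfo"]) {
// The user has already logged in; use the unified passport's cookie handling to extract the user info
$username = .....;
$password = ......;
$email = ......;
$ResultCode = "0";
} else {
// Logging in from the forum requires verification against the unified passport
$username = $_POST["username"];
$password = $_POST["password"];
// Verify
$ResultCode = ....// returns 0 if verification succeeds
$email = ....;// fetch the user's email from the passport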
}
 
if ($ResultCode == "0") {
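// Check whether the DZ user table already has this user. If it does and the password differs, update the password (so a passport user who changed their password can still log in to DZ); if not, insert a new record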
if ($loginfield == 'uid') {
$isuid = 1;
} elseif ($loginfield == 'email') {
$isuid = 2;
} else {
$isuid = 0;
}
 
$ucresult = uc_user_login($username, $password, $isuid, 1, $questionid, $answer);
list($tmp['uid'], $tmp['username'], $tmp['password'], $tmp['email'], $duplicate) = daddslashes($ucresult, 1);
$ucresult = $tmp;
 
if ($duplicate && $ucresult['uid'] > 0) {
if ($olduid = $db -> result_first("SELECT uid FROM {$tablepre}members WHERE username='" . addslashes($ucresult['username']) . "'")) {
require_once DISCUZ_ROOT . './include/membermerge.func.php';
membermerge($olduid, $ucresult['uid']);
uc_user_merge_remove($ucresult['username']);
} else {
return 0;
}
}
 
if ($ucresult['uid'] == -1) {
// The user does not exist or has been deleted
$uid = uc_user_register($username, $password, $email, $questionid, $answer, $onlineip);
if ($uid <= 0) {
fail();
}
 
$inviteconfig = array();
$query = $db -> query("SELECT * FROM {$tablepre}settings WHERE variable IN ('bbrules', 'bbrulestxt', 'welcomemsg', 'welcomemsgtitle', 'welcomemsgtxt', 'inviteconfig')");
while ($setting = $db -> fetch_array($query)) {
$$setting['variable'] = $setting['value'];
}
$invitecode = $regstatus > 1 && $invitecode ? dhtmlspecialchars($invitecode) : '';
if ($regstatus > 1) {
$inviterewardcredit = $inviteaddcredit = $invitedaddcredit = '';
@extract(unserialize($inviteconfig));
}
 
$groupinfo = $db -> fetch_first("SELECT groupid, allownickname, allowcstatus, allowcusbbcode, allowsigbbcode, allowsigimgcode, maxsigsize FROM {$tablepre}usergroups WHERE " . ($regverify ? "groupid='8'" : "creditshigher<=" . intval($initcredits) . " AND " . intval($initcredits) . "<creditslower LIMIT 1"));
 
$secques = $questionid > 0 ? random(8) : '';
$idstring = random(6);
$authstr = $regverify == 1 ? "$timestamp\t2\t$idstring" : '';
$password = md5(random(10));
$db -> query("INSERT INTO {$tablepre}members (uid, username, password, secques, adminid, groupid, regip, regdate, lastvisit, lastactivity, posts, credits, extcredits1, extcredits2, extcredits3, extcredits4, extcredits5, extcredits6, extcredits7, extcredits8, email, showemail, timeoffset, pmsound, invisible, newsletter)
VALUES ('$uid', '$username', '$password', '$secques', '0', '$groupinfo[groupid]', '$onlineip', '$timestamp', '$timestamp', '$timestamp', '0', $initcredits, '$email', '0', '9999', '1', '0', '1')");
 
$db -> query("REPLACE INTO {$tablepre}memberfields (uid, authstr $fieldadd1) VALUES ('$uid', '$authstr' $fieldadd2)");
} elseif ($ucresult['uid'] == -2) {
// Wrong password
if (!uc_user_edit($username, '', $password, $email, 1)) {
fail();
}
list($uid, $username, $email) = uc_get_user($username);
} else {
$uid = $ucresult['uid'];
}
 
$member = $db -> fetch_first("SELECT m.uid AS discuz_uid, m.username AS discuz_user, m.password AS discuz_pw, m.secques AS discuz_secques,
m.email, m.adminid, m.groupid, m.styleid, m.lastvisit, m.lastpost, u.allowinvisible
FROM {$tablepre}members m LEFT JOIN {$tablepre}usergroups u USING (groupid)
WHERE m.uid='$ucresult[uid]'");
 
if (!$member) {
// The account needs activation
fail();
}
 
$member['discuz_userss'] = $member['discuz_user'];
$member['discuz_user'] = addslashes($member['discuz_user']);
foreach($member as $var => $value) {
$GLOBALS[$var] = $value;
}
 
if (addslashes($member['email']) != $ucresult['email']) {
$db -> query("UPDATE {$tablepre}members SET email='$ucresult[email]' WHERE uid='$ucresult[uid]'");
}
 
if ($questionid > 0 && empty($member['discuz_secques'])) {
$GLOBALS['discuz_secques'] = random(8);
$db -> query("UPDATE {$tablepre}members SET secques='$GLOBALS[discuz_secques]' WHERE uid='$ucresult[uid]'");
}
 
$GLOBALS['styleid'] = $member['styleid'] ? $member['styleid'] : $_DCACHE['settings']['styleid'];
 
$cookietime = intval(isset($_POST['cookietime']) ? $_POST['cookietime'] : 0);
 
dsetcookie('cookietime', $cookietime, 31536000);
dsetcookie('auth', authcode("$member[discuz_pw]\t$member[discuz_secques]\t$member[discuz_uid]", 'ENCODE'), $cookietime, 1, true);
dsetcookie('loginuser');
dsetcookie('activationauth');
dsetcookie('pmnum');
 
$GLOBALS['sessionexists'] = 0;
 
if ($_DCACHE['settings']['frameon'] && $_DCOOKIE['frameon'] == 'yes') {
$GLOBALS['extrahead'] .= '<script>if(top != self) {parent.leftmenu.location.reload();}</script>';
}
 
$ucsynlogin = $allowsynlogin ? uc_user_synlogin($discuz_uid) : '';
if (!empty($inajax)) {
$msgforward = unserialize($msgforward);
$mrefreshtime = intval($msgforward['refreshtime']) * 1000;
include_once DISCUZ_ROOT . './forumdata/cache/cache_usergroups.php';
$usergroups = $_DCACHE['usergroups'][$groupid]['grouptitle'];
$message = 1;
include template('login');
} else {
if ($groupid == 8) {
showmessage('login_succeed_inactive_member', 'memcp.php');
} else {
showmessage('login_succeed', dreferer());
}
}
} else {
fail();
}
 
function fail() {
showmessage('undefined_action', null, 'HALTED');
}
 
?>
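
The core of dologin.php is the three-way branch on the uid returned by uc_user_login. As a rough summary of that branch only, here is a sketch; login_action is a hypothetical helper written for illustration and is not part of Discuz or UCenter.

```php
<?php
// Hypothetical helper that names the three branches dologin.php takes
// on the uid returned by uc_user_login().
function login_action($uc_uid) {
    if ($uc_uid == -1) {
        return 'register';        // user missing or deleted: create the account
    }
    if ($uc_uid == -2) {
        return 'reset_password';  // password mismatch: overwrite it with the passport one
    }
    return 'use_existing';        // any other value: use the uid as returned
}

foreach (array(15, -1, -2) as $uid) {
    echo $uid, ' => ', login_action($uid), "\n";
}
```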

Step 4

When the user logs out, the passport cookie must be cleared as well. This requires modifying logging.php:

if($action == 'logout' && !empty($formhash)) {
if($_DCACHE['settings']['frameon'] && $_DCOOKIE['frameon'] == 'yes') {
$extrahead .= '<script>if(top != self) {parent.leftmenu.location.reload();}</script>';
}
if($formhash != FORMHASH) {
showmessage('logout_succeed', dreferer());
}
$ucsynlogout = $allowsynlogin ? uc_user_synlogout() : '';
clearcookies();
setcookie("UserInfo", "", time() - 3600, "/", ".xxx.com", 1); // delete the passport-side cookie

Once these four steps are done, clear Discuz's data and template caches and you are finished.

Categories: Reflections