Blocking Search Engine Crawling and Indexing

Sometimes a site must stay out of search results. How do you stop search engines from indexing it?

Method 1: Use <meta> tags inside the page's <head> to stop search engines from indexing the page:

<meta name="robots" content="noindex"> <!-- block indexing by all search engines -->
<meta name="googlebot" content="noindex"> <!-- block indexing by Google -->
<meta name="Baiduspider" content="noindex"> <!-- block indexing by Baidu -->
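To confirm that a page really carries one of these tags, the HTML can be scanned for crawler-directed <meta> elements. The following is a minimal sketch using only Python's standard library; the class name RobotsMetaParser, the function blocks_indexing, and the sample markup are hypothetical names chosen for illustration, not part of the original setup.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content of <meta> tags aimed at crawlers (hypothetical helper)."""

    # Names taken from the tags above; compared case-insensitively.
    CRAWLER_NAMES = {"robots", "googlebot", "baiduspider"}

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in self.CRAWLER_NAMES:
            self.directives[name] = attr.get("content", "").lower()


def blocks_indexing(html):
    """Return True if any crawler-directed meta tag contains 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return any("noindex" in value for value in parser.directives.values())


# Sample markup, assumed for illustration only.
sample = '<html><head><meta name="robots" content="noindex"></head></html>'
print(blocks_indexing(sample))  # True
```

The check only looks at the markup; whether a given search engine honors the directive is up to that engine.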

Method 2: Use a robots.txt file in the site root to tell all crawlers to stay away from the whole site:

User-agent: *
Disallow: /
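One way to sanity-check these rules before deploying them is to feed them to Python's standard urllib.robotparser and ask whether a given crawler may fetch a URL. This is a rough sketch: the is_allowed helper and the example.com URLs are placeholders assumed for illustration.

```python
from urllib.robotparser import RobotFileParser

# The same two rules as above, held in a string instead of a file.
RULES = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(user_agent, url):
    """Return True if the rules permit user_agent to fetch url (hypothetical helper)."""
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain.
print(is_allowed("Baiduspider", "https://example.com/"))           # False
print(is_allowed("Googlebot", "https://example.com/any/page.html"))  # False
```

Note that robots.txt is advisory: it only affects crawlers that choose to obey it.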

Method 3: Block search engine user agents (UAs) in nginx by adding the following code to the server block:

if ($http_user_agent ~* "Baiduspider") {
    return 403;
}

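The pattern below covers the UAs of common search engines. Remove or add entries as needed, and use it in place of "Baiduspider" in the if condition: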

"qihoobot|Baiduspider|Googlebot|Googlebot-Mobile|Googlebot-Image|Mediapartners-Google|Adsbot-Google|Feedfetcher-Google|Yahoo! Slurp|Yahoo! Slurp China|YoudaoBot|Sosospider|Sogou spider|Sogou web spider|MSNBot|ia_archiver|Tomato Bot"
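Before putting a pattern like this into nginx, it can help to try it against a few real User-Agent strings. The sketch below approximates nginx's case-insensitive `~*` match with Python's re module; for a plain alternation like this one the two should behave the same, but that equivalence is an assumption rather than a guarantee for more complex patterns. The would_block helper and the browser UA string are hypothetical.

```python
import re

# The alternation from the list above, copied verbatim.
UA_PATTERN = (
    "qihoobot|Baiduspider|Googlebot|Googlebot-Mobile|Googlebot-Image|"
    "Mediapartners-Google|Adsbot-Google|Feedfetcher-Google|Yahoo! Slurp|"
    "Yahoo! Slurp China|YoudaoBot|Sosospider|Sogou spider|Sogou web spider|"
    "MSNBot|ia_archiver|Tomato Bot"
)

# re.IGNORECASE stands in for the case-insensitivity of nginx's ~* operator.
BLOCKED_UA = re.compile(UA_PATTERN, re.IGNORECASE)


def would_block(user_agent):
    """Return True if the pattern matches anywhere in the UA string."""
    return BLOCKED_UA.search(user_agent) is not None


baidu = "Mozilla/5.0 (compatible;Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
browser = "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0"

print(would_block(baidu))    # True
print(would_block(browser))  # False
```

Because the match is a substring search, any client whose UA merely contains one of these names is blocked as well, and a crawler that sends a different UA is not.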


Simulate a visit to this site from the Baidu spider:
[root@docker02 ~]# UA="Mozilla/5.0 (compatible;Baiduspider/2.0; +http://www.baidu.com/search/spider.html)";
[root@docker02 ~]# curl -H "User-Agent: $UA" https://teddylu.xyz
<html>
<head><title>403 Forbidden</title></head>
<body bgcolor="white">
<center><h1>403 Forbidden</h1></center>
<hr><center>nginx</center>
</body>
</html>
