---
200pleasebot: 200PleaseBot
360spider: 360Spider
abot: CrawlDaddy, abot
addthis: AddThis
adldxbot: Microsoft Bing Ads
admantx: ADmantX Platform Semantic Analyzer
adsbot-google: Google Adwords
adstxtcrawler: AdsTxtCrawler
advbot: AdvBot
ahrefsbot: Ahrefs backlinks research tool
alexa: Alexa Crawler
anderspink: AndersPinkBot
apache-httpclient: Java http library
apachebench: ApacheBench (ab)
apis-google: APIs-Google
appengine-google: Google App Engine
applebot: Apple Bot
appsignal: AppSignal Bot
archive.org_bot: Internet Archive (archive.org)
archiveteam archivebot: ArchiveTeam ArchiveBot
ask jeeves: Ask Jeeves
asynchttpclient: Java http and WebSocket client library
awe.sm: Awe.sm URL expander
baidu: Baidu
barkrowler: Barkrowler
bdcbot: Big Data Corp
bingbot: Microsoft Bing
bingpreview: Microsoft Bing preview
bitlybot: bit.ly bot
blekkobot: Blekkobot
blexbot: BLEXBot (webmeup)
bot@linkfluence.com: Linkfluence bot
bubing: BUbiNG
bufferbot: BufferBot
buibui-checkbot: buibui
butterfly: Topsy Labs
buzzbot: Buzzbot
buzztalk: buzztalk
catchbot: CatchBot (catchbot.com)
check_http: Nagios monitor
checkly: Checkly
chrome-lighthouse: Chrome-Lighthouse
cipacrawler: CipaCrawler
cliqzbot: Cliqzbot
cloudflare: CloudFlare-AlwaysOnline
cmradar/0.1: CMRadar/0.1
coldfusion: ColdFusion http library
commoncrawl: CCBot
comodo ssl checker: COMODO SSL Checker
comodo-webinspector-crawler: Comodo
copypants: BotPants
crowsnest: Crowsnest
curabot: cura.yt
curl: curl unix CLI http client
dap/nethttp: DAP/NetHTTP
datafeedwatch: DataFeedWatch
datagnionbot: datagnion.com/bot.html
datanyze: Datanyze
daumoa: Korean portal and search engine indexing bot
developers.google.com/+/web/snippet/: Google Plus
diffbot: Diffbot
digitalpersona fingerprint software: HP Fingerprint scanner
domain re-animator bot: Domain Re-Animator Bot
domainsbot: DomainsBot
domaintunocrawler: DomainTuno
dotbot: Dot Bot
duckduckbot: Duck Duck Go
elb-healthchecker: AWS ELB HealthChecker
embedly: Embedly
eoaagent: EOAAgent
everyonesocialbot: EveryoneSocial
evrinid: Evri bot
exabot: Exalead's bot
exaleadcloudview: ExaleadCloudView
ez publish: eZ Publish Link Validator
facebookexternalhit: Facebook Bot
facebot: Facebook Bot
feedburner: RSS bot
feedfetcher-google: Google Feedfetcher
findxbot: Findxbot
flipboardproxy: FlipboardProxy
friendfeedbot: FriendFeed
fyrebot: Fyrebot
garlik: GarlikCrawler
genieo: Genieo Web filter bot
germcrawler: GermCrawler
getprismatic.com: getprismatic.com
gigabot: Gigabot spider
gimme60bot: Gimme60 (gimme60.com)
gimmeusabot: Gimme60 (gimme60.com)
go http package: Go http library
go-http-client: Go http client
google page speed insights: Google Page Speed Insights
google web preview: Google Instant Previews crawler
google-site-verification: Google Site Verification
google-structured-data-testing-tool: Google Structured Data Testing Tool
google-structureddatatestingtool: Google Structured Data Testing Tool
google-xrawler: Google Shopping
googlebot: Google Bot
googleimageproxy: Google Image Proxy
googlestackdrivermonitoring-uptimechecks: Google Stackdriver Monitoring - Uptime Checks
grapeshotcrawler: GrapeshotCrawler
gravitybot: Gravity Bot
hatena::bookmark: Hatena::Bookmark
heritrix: heritrix
https://developers.google.com/+/web/snippet: Google+ Snippet Fetcher
httrack: HTTrack
hubspot: HubSpot
ia_archiver: Internet Archive (WayBackMachine)
icoreservice: iCoreService
idmarch: idmarch.org/bot.html
implisensebot: ImplisenseBot
inagist: URL resolver
insieve: Insieve Bot
insitesbot: Insitesbot
instapaper: Instapaper
istellabot: IstellaBot
jaunt: Jaunt - Java Web Scraping & JSON Querying
jetslide: Jetslide
jobseeker: jobseeker.com.au/bot.html
jooble: Jooble
js-kit: URL resolver
kemvibot: Kemvi
kimengi: Kimengi Bot
knows.is: knows.is
kojitsubot: Kojitsubot
komodiabot: KomodiaBot
kraken: kraken
laconica: Laconica
lijit crawler: Lijit
linkdexbot: Linkdex Bot
linkedinbot: LinkedIn
linkscrawler: LinksCrawler
linode: Linode Longview
lipperhey: Lipperhey
livelapbot: Livelapbot
loadtimebot: Load Time Bot
longurl: URL expander service
ltx71: ltx71.com
lumibot: Lumibot
magpie-crawler: magpie-crawler
mail.ru_bot: Mail.ru Bot
mappydata: Mappy
mastodon: Mastodon URL expander
mauibot: MauiBot
meanpathbot: meanpath
mediapartners-google: Google Adsense bot
megaindex.ru: MegaIndex
memorybot: mignify.com/bot.html
metauri: MetaURI
mfe_expand: McAfee spider
mir web crawler: MIR web crawler
mj12bot: Majestic-12 spider
mojeekbot: Mojeek UK search crawler
ms search 6.0 robot: MS Search 6.0 Robot
msnbot-media: Microsoft media bot
msnbot: Microsoft bot
nerdybot: NerdyBot
netcraft: Netcraft
netstate: netEstate NE Crawler
netvibes: Personalized dashboard bot
netzcheckbot: netzcheck
newrelicmonitor: NewRelic monitor
newrelicpinger: NewRelicPinger
newsme: newsme
niki-bot: niki-bot
ning: NING - Yet Another Twitter Swarmer
nutch: Apache search spider
openhosebot: OpenHoseBot
orangebot: OrangeBot
paessler: paessler.com - PRTG Network Monitor
pagesinventory: pagesinventory.com
panopta: Monitoring service
paperlibot: PaperLi
peerindex: peerindex
percolatecrawler: PercolateCrawler
perfectmarketkwtbot: PerfectMarket
phantomjs: PhantomJS
pingdom: Pingdom monitoring
pinterest: Pinterest
plukkie: botje.com/plukkie.htm
pr-cy.ru: PR-CY.RU
privacyawarebot: PrivacyAwareBot
proximic: Proximic Spider
psbot-page: Picsearch
pu_in: Pu_iN Crawler
publiclibraryarchive.org: publiclibraryarchive.org
pycurl: Python http library
python-httplib2: Python-httplib2
python-requests: Python http library
python-urllib: Python http library
queryseeker: QuerySeekerSpider
quick-crawler: Quick-Crawler
quicklook: QuickLook
re-animator: Domain Re-Animator Bot
readability: Readability
rebelmouse: RebelMouse
redditbot: Reddit Bot
relateiq: RelateIQ
riddler: Riddler Bot
rogerbot: SeoMoz spider
rssmicro: RSS/Atom Feed Robot (rssmicro.com)
scouturlmonitor: ScoutURLMonitor
scrapy: Scrapy
screaming frog seo spider: Screaming Frog SEO Spider
searchmetricsbot: SearchmetricsBot
semanticbot: Semanticbot
semrushbot: SEO analysis bot
seo-audit: seo-audit-check-bot
seobilitybot: SeobilityBot
seodiver: SEOdiver
seokicks: SEOKicks
seznambot: SeznamBot
shopwiki: ShopWiki
shortlinktranslate: Link shortener
showyoubot: Showyou iOS app spider
siege: Joe Dog Siege
sistrix: SISTRIX
sitecheck: SiteCheck sitecrawl
siteuptime: Site monitoring services
skypeuripreview: SkypeUriPreview
slack-imgproxy: Slack Image Proxy
slack-linkexpanding: Slack Link Expanding
slack: Slack Link Expanding
slackbot: Slackbot
slurp: Yahoo spider
smtbot: SimilarTech
snapchat: Snapchat
socialrank: SocialRankIOBot
sogou: Chinese search engine
spbot: OpenLinkProfiler
spinn3r: Spinn3r aggregator
sputnikbot: SputnikBot
squider: Squider
statuscake: StatusCake
swiftbot: Swiftype Bot
tangibleebot: TangibleeBot
teeraid: TeeRaidBot
test certificate info: C http library?
the knowledge ai: Knowledge AI Bot
tineye: TinEye Bot
traackr: Traackr Bot
trendictionbot: Trendiction Search
trendsmap: Trendsmap Resolver
turnitinbot: TurnitinBot
tweetedtimes: The Tweeted Times
tweetmemebot: TweetMeMe Crawler
twikle: Social web search bot
twitjobsearch: TwitJobSearch
twitmunin: Twitmunin
twitterbot: Twitter URL expander
twurly: Twurly
typhoeus: Typhoeus
umbot: uberMetrics
unwindfetch: Gnip
updown: Updown.io monitor
uptimerobot: Uptime Robot
vagabondo: Vagabondo
vb project: Visual Basic
vigil: Vigil
vkshare: VKontakte Sharer
voilabot: VoilaBot
vrcrawler: Venture Radar
wasalive-bot: Wasalive Bots
watchsumo: WatchSumo
wbsearchbot: Ware Bay Best Buys
webceo: online-webceo-bot
webscout: Webscout
wesee: WeSEE
wget: wget unix CLI http client
whatsapp: WhatsApp
wikido: WikiDo
woorank: WooRank
wordpress: WordPress spider
woriobot: woriobot
wormly: WormlyBot
wotbox: Wotbox
xenu link sleuth: Xenu Link Sleuth
xing-contenttabreceiver: Xing bot
xovibot: XoviBot
yacybot: YaCy
yahoo-ad-monitoring: Yahoo Ad monitoring
yandex: Yandex
yanga: Yanga WorldSearch Bot
yeti: Naver Corp
yourls: YOURLS
zabbix: Zabbix
zelist.ro: feed parser
zibb: ZIBB spider
zitebot: Zite
zoombot: ZoomBot
zoominfobot: ZoominfoBot
zyborg: Zyborg
# AI Crawlers
# https://darkvisitors.com
amazonbot: Amazon
anthropic-ai: Anthropic-AI
# applebot: Apple (duplicate key, already defined above as "Apple Bot")
bytespider: TikTok
ccbot: Common Crawl
chatgpt-user: ChatGPT
claude-web: Anthropic-AI
cohere-ai: Cohere
facebookbot: Facebook
google-extended: Google
googleother: Google
gptbot: ChatGPT
omgili: Webz.io
perplexitybot: Perplexity
webz.io: Webz.io
youbot: You.com
# Generic lib user agents go here.
httpie: HTTPie
eventmachine httpclient: Ruby http library
go 1.1 package http: Go 1.1 package http
htmlparser: HTMLParser
http_request2: HTTP_Request2
httpclient: HTTPClient
jakarta commons: Jakarta Commons HttpClient
java: Generic Java http library
libwww-perl: Perl client-server library
lwp-trivial: Another Perl library
ruby: Ruby