geminispace.info

gemini search engine
git clone https://git.clttr.info/geminispace.info.git

commit c10da9f7bfe5ef7395f1b679e91bf329073439ff
parent 44f6e6250611aba9dd3557eba7326b67d4c4249e
Author: Natalie Pendragon <natpen@natpen.net>
Date:   Fri,  5 Jun 2020 06:46:55 -0400

[crawl] Remove manual exclusions for alexschroeder.ch

They updated their robots.txt, so now the Disallow lines are parsing
correctly.
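
For context, the crawler can rely on the standard library's robots.txt parser rather than hard-coded per-capsule exclusions. The sketch below is illustrative only, not GUS's actual code, and the Disallow rules shown are assumed: it feeds robots.txt lines to urllib.robotparser and asks whether a given URL may be fetched.

import urllib.robotparser

# Assumed robots.txt content, supplied as raw lines. A real Gemini crawler
# would first fetch <capsule>/robots.txt over the Gemini protocol.
robots_lines = [
    "User-agent: *",
    "Disallow: /map/",
    "Disallow: /do/rc",
]

parser = urllib.robotparser.RobotFileParser()
parser.parse(robots_lines)  # parse() also marks the parser as up to date

print(parser.can_fetch("*", "gemini://alexschroeder.ch/map/overview"))  # False: matches Disallow: /map/
print(parser.can_fetch("*", "gemini://alexschroeder.ch/page/Welcome"))  # True: no matching rule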

Diffstat:
M gus/crawl.py | 10 ----------
1 file changed, 0 insertions(+), 10 deletions(-)

diff --git a/gus/crawl.py b/gus/crawl.py
@@ -100,16 +100,6 @@ EXCLUDED_URL_PREFIXES = [
     # Geddit
     "gemini://geddit.pitr.ca/post?",
     "gemini://geddit.pitr.ca/c/",
-
-    # alexschroeder.ch b/c its robots.txt isn't working...
-    "gemini://alexschroeder.ch/map/",
-    "gemini://alexschroeder.ch/do/rc",
-    "gemini://alexschroeder.ch/do/rss",
-    "gemini://alexschroeder.ch/do/new",
-    "gemini://alexschroeder.ch/do/more",
-    "gemini://alexschroeder.ch/do/tags",
-    "gemini://alexschroeder.ch//do/match",
-    "gemini://alexschroeder.ch/do/search",
 ]
 
 EXCLUDED_URL_PATHS = [
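
For readers unfamiliar with the list being trimmed here: EXCLUDED_URL_PREFIXES holds URL prefixes the crawler skips outright. A minimal sketch of such a prefix check follows; the actual logic in gus/crawl.py may differ, and is_excluded is a hypothetical helper name.

EXCLUDED_URL_PREFIXES = [
    "gemini://geddit.pitr.ca/post?",
    "gemini://geddit.pitr.ca/c/",
]

def is_excluded(url: str) -> bool:
    # Skip any URL that falls under a manually excluded prefix.
    return any(url.startswith(prefix) for prefix in EXCLUDED_URL_PREFIXES)

assert is_excluded("gemini://geddit.pitr.ca/c/gemini")
assert not is_excluded("gemini://geddit.pitr.ca/")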