//usr/lib64/lib64//lib64/python2.7//ó
ÙœSec @ s} d Z d d l Z d d l Z d g Z d d d „ ƒ YZ d d d „ ƒ YZ d d
d „ ƒ YZ d e j f d
„ ƒ YZ d S( s< robotparser.py
Copyright (C) 2000 Bastian Kleineidam
You can choose between two licenses when using this package:
1) GNU GPLv2
2) PSF license for Python 2.2
The robots.txt Exclusion Protocol is implemented as specified in
http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html
iÿÿÿÿNt RobotFileParserc B sb e Z d Z d d „ Z d „ Z d „ Z d „ Z d „ Z d „ Z d „ Z d „ Z
d
„ Z RS( ss This class provides a set of methods to read, parse and answer
questions about a single robots.txt file.
t c C s> g | _ d | _ t | _ t | _ | j | ƒ d | _ d S( Ni ( t entriest Nonet
default_entryt Falset disallow_allt allow_allt set_urlt last_checked( t selft url( ( s# /usr/lib64/python2.7/robotparser.pyt __init__ s
c C s | j S( s· Returns the time the robots.txt file was last fetched.
This is useful for long-running web spiders that need to
check for new robots.txt files periodically.
( R ( R
( ( s# /usr/lib64/python2.7/robotparser.pyt mtime s c C s d d l } | j ƒ | _ d S( sY Sets the time the robots.txt file was last fetched to the
current time.
iÿÿÿÿN( t timeR ( R
R ( ( s# /usr/lib64/python2.7/robotparser.pyt modified) s c C s/ | | _ t j | ƒ d d !\ | _ | _ d S( s, Sets the URL referring to a robots.txt file.i i N( R t urlparset hostt path( R
R ( ( s# /usr/lib64/python2.7/robotparser.pyR 1 s c C s¯ t ƒ } | j | j ƒ } g | D] } | j ƒ ^ q" } | j ƒ | j | _ | j d k rk t | _ n@ | j d k r† t | _ n% | j d k r« | r« | j | ƒ n d S( s4 Reads the robots.txt URL and feeds it to the parser.i‘ i“ i iÈ N( i‘ i“ (
t URLopenert openR t stript closet errcodet TrueR R t parse( R
t openert ft linet lines( ( s# /usr/lib64/python2.7/robotparser.pyt read6 s
c C sA d | j k r- | j d k r= | | _ q= n | j j | ƒ d S( Nt *( t
useragentsR R R t append( R
t entry( ( s# /usr/lib64/python2.7/robotparser.pyt
_add_entryD s c C s d } d } t ƒ } xä| D]Ü} | d 7} | s~ | d k rP t ƒ } d } q~ | d k r~ | j | ƒ t ƒ } d } q~ n | j d ƒ } | d k r¦ | | } n | j ƒ } | s¾ q n | j d d ƒ } t | ƒ d k r | d j ƒ j ƒ | d