Robots.txt Handler
Purpose
A robots.txt file specifies which directories robots (spiders) may index and which they may not. The robots of most search engines follow the instructions in the robots.txt file, but there is no guarantee of this.
The Robots.txt Handler generates a virtual robots.txt file, which is returned in response to any request for the URL /robots.txt.
- This makes any robots.txt file already included in your web application ineffective.
- For more information regarding adding and editing Handlers, see Editing Handlers.
Severity
Events triggered by this handler are given the severity: low. (For details on severity levels, see Severity of Events Triggered by Handlers.)
Recommendations for use
This handler is only useful for paths. To simplify configuration, you can first make the basic settings in the Anti Spider Wizard. Typical, known User Agents are already preconfigured there, and you configure the Check User Agent Handler and the Required Header Field Handler in the same operation. Then edit the Robots.txt Handler for the specific path if required.
Attributes
Attribute | Meaning |
---|---|
allowedAgents | List of permitted User Agents. |
usertext | Optional: Here you can specify some text that vWAF adds to the log file entries created by this handler. You can use this, for example, to document why you've added the handler to your configuration, and how the handler is intended to behave. |
enable logging | Disable this option if you do not want vWAF to create a log file entry when the handler is executed. This can be useful to keep log files smaller if the handler creates a large number of entries that you don't need. In detection mode, disabling logging makes the handler de facto ineffective. Disabling logging also prevents the actions of the handler from being taken into account for the Top-10 lists in Attack Analysis, and from being listed in Reports. To decrease the size of the log files, also consider enabling reduced logging, which excludes all non-handler-related information from the log files (see Editing Applications). |
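
To illustrate how a list of permitted User Agents such as allowedAgents could translate into a robots.txt file, here is a minimal Python sketch. It is not vWAF's implementation, and the exact content that vWAF generates is not documented on this page. The function name build_robots_txt and the example agent names are hypothetical. The sketch assumes that permitted robots are admitted everywhere and that all other robots are asked to stay out. It then uses Python's standard urllib.robotparser module to show how a compliant robot would interpret the result.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(allowed_agents):
    """Return robots.txt text that admits only the listed user agents.

    Hypothetical helper for illustration; not part of vWAF.
    """
    lines = []
    for agent in allowed_agents:
        # An empty Disallow value lets the named robot crawl everything.
        lines += [f"User-agent: {agent}", "Disallow:", ""]
    # Every other robot is asked to stay out of the whole site.
    lines += ["User-agent: *", "Disallow: /"]
    return "\n".join(lines) + "\n"


# Assumed example agents; replace with your own list of permitted robots.
robots_txt = build_robots_txt(["Googlebot", "bingbot"])
print(robots_txt)

# Check how a robot that honors robots.txt would read the file.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("Googlebot", "/index.html"))     # True
print(parser.can_fetch("SomeOtherBot", "/index.html"))  # False
```

As noted under Purpose, a file like this is only a request: robots that ignore robots.txt are not stopped by it.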