- Issue created by @somebodysysop
- 🇺🇸 United States somebodysysop
So, in further testing, I discovered that AutoBan REGEXP does not accept regex start and end boundaries:
It appears that this is the way to handle the word crawls (no beginning and ending word boundaries):
[A-Za-z]+\.([0-9]{1,2})\.([0-9]{1,2})
But I need to know how to match in REGEXP mode when the message starts with a particular pattern, like:
topics
- 🇺🇦 Ukraine goodboy Kharkiv, Ukraine
Hi, Autoban uses SQL Regexp syntax, see https://dev.mysql.com/doc/refman/8.0/en/regexp.html#operator_regexp.
I've added a debug mode to the latest developer version. You can see the text of the request via the Test link, for example.
The query mode has 2 options: LIKE/REGEXP.
Also, you can use wildcards or not.
- 🇺🇸 United States somebodysysop
Thank you. I am using REGEXP mode.
According to this page: https://dev.mysql.com/doc/refman/8.0/en/regexp.html#operator_regexp
To match patterns at the start of log entries or any string in SQL using REGEXP, the caret (^) symbol is indeed used to anchor the pattern to the beginning of the string.
What I am saying is that when I enter
^topics
As my pattern, the test fails.
When I enter
topics
The test returns all the log entries that begin with /topics
The problem with using the latter is that, technically, it will also match something like "?topics=", which I DO NOT want matched.
Do you see the issue? I don't think AutoBan REGEXP mode recognizes the caret (^) symbol.
- 🇺🇸 United States somebodysysop
Update:
I did figure out the workaround:
/topics
This will match topics at the beginning of the message. Of course, theoretically it will also match:
/page/topics
But for now, this works for me.
Please let me know if I am correct about the caret (^) issue.
- 🇺🇦 Ukraine goodboy Kharkiv, Ukraine
I did the following:
- Updated the taxonomy Tags and received a message in DB log "Updated vocabulary Tags."
- Went to the Autoban Log Analyze (/admin/config/people/autoban/analyze) and I see an entry with columns values:
Type = 'taxonomy', Message raw = 'Updated vocabulary %name.' and Variables raw = 'a:1:{s:5:"%name";s:4:"Tags";}'
- Created an Autoban rule with parameters:
Type = 'taxonomy', Message pattern = '^Updated', Threshold = 1
- The query mode was set as REGEXP.
I ran SQL query as
"SELECT "log"."hostname" AS "hostname", COUNT(log.hostname) AS "hcount" FROM {watchdog} "log" WHERE ("log"."type" = :db_condition_placeholder_0) AND (("log"."message" REGEXP :db_condition_placeholder_1) OR ("log"."variables" REGEXP :db_condition_placeholder_2)) GROUP BY "log"."hostname" HAVING (COUNT(log.hostname) >= :cnt)"
and got IP addresses as a result.
The pure SQL query is
SELECT log.hostname AS hostname, COUNT(log.hostname) AS hcount FROM watchdog log WHERE (log.type = 'taxonomy') AND ((log.message REGEXP '^Updated') OR (log.variables REGEXP '^Updated')) GROUP BY log.hostname HAVING (COUNT(log.hostname) >= 1)
and I got the same result using the mysql console.
I did the same thing with the
name.$
expression.
So SQL REGEXP works, and it all comes down to the expression itself. Perhaps the problem is in the escaping of special characters.
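The test above can be reproduced outside Drupal with a rough sketch like the following. It is not Autoban's code: it uses an in-memory SQLite table with a user-defined REGEXP function backed by Python's re module, which is only an approximation of MySQL's REGEXP operator. The hostname 203.0.113.7 and the helper name matching_hosts are made up for illustration.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite evaluates "value REGEXP pattern" as regexp(pattern, value).
    if value is None:
        return 0
    return int(re.search(pattern, value) is not None)


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute(
    "CREATE TABLE watchdog (hostname TEXT, type TEXT, message TEXT, variables TEXT)"
)
# Hypothetical row holding the raw (unrendered) values described above.
conn.execute(
    "INSERT INTO watchdog VALUES (?, ?, ?, ?)",
    (
        "203.0.113.7",
        "taxonomy",
        "Updated vocabulary %name.",
        'a:1:{s:5:"%name";s:4:"Tags";}',
    ),
)

QUERY = """
SELECT log.hostname AS hostname, COUNT(log.hostname) AS hcount
FROM watchdog log
WHERE (log.type = ?)
  AND ((log.message REGEXP ?) OR (log.variables REGEXP ?))
GROUP BY log.hostname
HAVING (COUNT(log.hostname) >= ?)
"""


def matching_hosts(log_type, pattern, threshold=1):
    # The same pattern is tested against both raw columns.
    rows = conn.execute(QUERY, (log_type, pattern, pattern, threshold)).fetchall()
    return [host for host, _count in rows]
```

In this sketch both '^Updated' and 'name.$' select the row, because they are anchored against the raw message 'Updated vocabulary %name.', while '^Tags' selects nothing: the word Tags only appears inside the serialized variables, never at the start of either column.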
- 🇪🇨 Ecuador jwilson3
I think I see your issue.
The Autoban module is more for scanning Watchdog log messages for behaviors that seem like repeatable patterns.
The message column for "page not found" errors in the watchdog log doesn't actually contain URLs like '/topic'. It contains a PHP "placeholder" string like '@uri'. Additionally, the variables column contains a PHP serialized array like 'a:1:{s:4:"@uri";s:6:"/topic";}'. Autoban REGEXP and LIKE queries both run against the raw database column contents (before placeholders are filled in), so a pattern like '^/topic' will never match.
One workaround would be to use the colon and double-quote characters from the serialized string to mark the start of the URL, e.g. ':"/topic', which will turn up a match against the variables database column. This needs testing.
On the other hand, if you want simple regex-style banning of any request to specific URLs that doesn't involve scanning watchdog logs, try the Perimeter Defense module, which does exactly this.
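A quick way to see why the caret pattern fails and the serialized-string workaround might help is to test the patterns against raw column values by hand. This is only a sketch: the rows are hypothetical, rule_matches is an invented helper, and Python's re module stands in for MySQL's REGEXP, so edge cases may differ.

```python
import re


def rule_matches(pattern, message_raw, variables_raw):
    # Mirrors the rule logic: a hit in either raw column counts as a match.
    return any(
        re.search(pattern, column) is not None
        for column in (message_raw, variables_raw)
    )


# Hypothetical raw rows for two "page not found" entries.
topic_row = ("@uri", 'a:1:{s:4:"@uri";s:6:"/topic";}')
query_row = ("@uri", 'a:1:{s:4:"@uri";s:16:"/search?topics=1";}')

print(rule_matches("^/topic", *topic_row))    # False: neither column starts with /topic
print(rule_matches("topic", *topic_row))      # True
print(rule_matches("topic", *query_row))      # True: the unwanted "?topics=" hit
print(rule_matches(':"/topic', *topic_row))   # True
print(rule_matches(':"/topic', *query_row))   # False
```

The ':"/topic' pattern relies on the URL value being immediately preceded by the serialized length prefix and an opening double quote, so it only behaves like a start-of-URL anchor as long as the serialized format stays as shown.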
- 🇺🇦 Ukraine goodboy Kharkiv, Ukraine
The watchdog table stores each message in 2 columns: an untranslated text pattern (in English) and the serialized variables. The human-readable text is generated by t(pattern, variables) according to the current language. We can match only against the unified message pattern or the serialized variables string.
I don't see a simple solution yet.
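To illustrate the gap between what is stored and what a person reads in the log, here is a minimal sketch. The render function is a simplified stand-in for t(pattern, variables): it only substitutes placeholders and does no translation or escaping, and the variables are given as a plain dict rather than a PHP serialized string.

```python
import re


def render(message, variables):
    # Simplified stand-in for t(): replace each placeholder with its value.
    rendered = message
    for placeholder, value in variables.items():
        rendered = rendered.replace(placeholder, value)
    return rendered


raw_message = "Updated vocabulary %name."
variables = {"%name": "Tags"}

rendered = render(raw_message, variables)
print(rendered)  # Updated vocabulary Tags.

# A pattern written against the rendered text matches the rendered string only.
pattern = "^Updated vocabulary Tags"
print(re.search(pattern, rendered) is not None)     # True
print(re.search(pattern, raw_message) is not None)  # False
```

A rule written from the text shown on the log page therefore has nothing to match against in the stored pattern column, which is the limitation described above.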