Correct message pattern usage and AutoBan Banned IP Log?

Issue created by @somebodysysop
Comment over 1 year ago →
🇺🇸United States somebodysysop
So, in further testing, I discovered that AutoBan REGEXP does not accept regex start and end boundaries:

It appears that this is the way to handle the word crawls (no beginning and ending word boundaries):

[A-Za-z]+\.([0-9]{1,2})\.([0-9]{1,2})

But I need to know how to match in REGEXP when the message starts with a particular pattern, like:

topics
Comment over 1 year ago →
🇺🇦Ukraine goodboy Kharkiv, Ukraine
Hi, Autoban uses SQL Regexp syntax, see https://dev.mysql.com/doc/refman/8.0/en/regexp.html#operator_regexp.
I've added a debug mode to the latest developer version. You can see the text of the request via the Test link, for example.

The query mode has two options: LIKE or REGEXP.
You can also choose whether or not to use wildcards.
Comment over 1 year ago →
🇺🇸United States somebodysysop
Thank you. I am using REGEXP mode.

According to this page: https://dev.mysql.com/doc/refman/8.0/en/regexp.html#operator_regexp

To match patterns at the start of log entries or any string in SQL using REGEXP, the caret (^) symbol is indeed used to anchor the pattern to the beginning of the string.

What I am saying is that when I enter

^topics

As my pattern, the test fails.

When I enter

topics

The test returns all the log entries that begin with /topics

The problem with using the latter is that, technically, it will also match something like "?topics=", which I DO NOT want matched.

Do you see the issue? I don't think AutoBan REGEXP mode recognizes the caret (^) symbol.
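
To make the over-matching concern concrete, here is a minimal sketch in Python. It uses Python's re module rather than MySQL's REGEXP engine, so it only illustrates how the anchor behaves for these simple patterns. The sample strings are made up for illustration and do not come from a real log.

```python
import re

# Hypothetical sample strings; not taken from a real watchdog table.
samples = ["/topics", "/page/topics", "/search?topics=1"]


def matching(pattern, strings):
    """Return the strings in which the pattern is found anywhere."""
    return [s for s in strings if re.search(pattern, s)]


# Unanchored: found in every sample, including the query-string case.
unanchored = matching(r"topics", samples)

# Anchored without the leading slash: every sample starts with "/",
# so nothing can match at position 0.
caret_no_slash = matching(r"^topics", samples)

# Anchored with the leading slash: only strings that start with /topics.
caret_with_slash = matching(r"^/topics", samples)

print(unanchored)
print(caret_no_slash)
print(caret_with_slash)
```

Under these assumptions, a caret pattern has to include the leading slash to match a path at all, and the unanchored form is the only one that also picks up "?topics=".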
Comment over 1 year ago →
🇺🇸United States somebodysysop
Update:

I did figure out the workaround:

/topics

will match topics at the beginning of the message. Of course, theoretically it will also match:

/page/topics

But for now, this works for me.

Please let me know if I am correct about the caret (^) issue.
Comment over 1 year ago →
🇺🇦Ukraine goodboy Kharkiv, Ukraine
I did the following:

Updated the taxonomy Tags and received a message in DB log "Updated vocabulary Tags."

Went to the Autoban Log Analyze page (/admin/config/people/autoban/analyze), where I see an entry with these column values:
Type = 'taxonomy', Message raw = 'Updated vocabulary %name.' and Variables raw = 'a:1:{s:5:"%name";s:4:"Tags";}'

Created an Autoban rule with parameters:
Type = 'taxonomy', Message pattern = '^Updated', Threshold = 1

The query mode was set as REGEXP.

I ran the SQL query
"SELECT "log"."hostname" AS "hostname", COUNT(log.hostname) AS "hcount" FROM {watchdog} "log" WHERE ("log"."type" = :db_condition_placeholder_0) AND (("log"."message" REGEXP :db_condition_placeholder_1) OR ("log"."variables" REGEXP :db_condition_placeholder_2)) GROUP BY "log"."hostname" HAVING (COUNT(log.hostname) >= :cnt)"
and got IP addresses as the result.

The pure SQL query is
SELECT log.hostname AS hostname, COUNT(log.hostname) AS hcount FROM watchdog log WHERE (log.type = 'taxonomy') AND ((log.message REGEXP '^Updated') OR (log.variables REGEXP '^Updated')) GROUP BY log.hostname HAVING (COUNT(log.hostname) >= 1)
and I got the same result using the mysql console.

I did the same thing with the name.$ expression.

So SQL REGEXP works, and it all comes down to the expression itself. Perhaps the problem is in the escaping of special characters.
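
For anyone who wants to repeat this check without a Drupal site, here is a rough sketch using Python's built-in sqlite3 module. It is not the module's code: SQLite has no REGEXP implementation of its own, so a Python function is registered for it, and Python's re syntax is assumed to be close enough to MySQL's for these patterns. The table is a cut-down stand-in for watchdog, and the hostname 192.0.2.10 is a made-up documentation address.

```python
import re
import sqlite3


def regexp(pattern, value):
    """SQLite evaluates `value REGEXP pattern` as regexp(pattern, value)."""
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)

# Simplified stand-in for the watchdog table (assumption: only four columns).
conn.execute(
    "CREATE TABLE watchdog (type TEXT, message TEXT, variables TEXT, hostname TEXT)"
)
conn.execute(
    "INSERT INTO watchdog VALUES (?, ?, ?, ?)",
    (
        "taxonomy",
        "Updated vocabulary %name.",
        'a:1:{s:5:"%name";s:4:"Tags";}',
        "192.0.2.10",  # hypothetical hostname
    ),
)


def find_hosts(log_type, pattern, threshold):
    """Same shape as the query above: message OR variables, grouped by host."""
    sql = (
        "SELECT log.hostname AS hostname, COUNT(log.hostname) AS hcount "
        "FROM watchdog log "
        "WHERE (log.type = ?) "
        "AND ((log.message REGEXP ?) OR (log.variables REGEXP ?)) "
        "GROUP BY log.hostname HAVING (COUNT(log.hostname) >= ?)"
    )
    return conn.execute(sql, (log_type, pattern, pattern, threshold)).fetchall()


print(find_hosts("taxonomy", "^Updated", 1))  # raw message starts with "Updated"
print(find_hosts("taxonomy", "name.$", 1))    # raw message ends with "name."
print(find_hosts("taxonomy", "^Tags", 1))     # neither raw column starts with "Tags"
```

The first two patterns find the row because the raw message column matches them. The third finds nothing, since "Tags" only appears in the middle of the serialized variables string.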
Comment over 1 year ago →
🇪🇨Ecuador jwilson3
I think I see your issue.

The Autoban module is more for scanning Watchdog log messages for behaviors that seem like repeatable patterns.

The message column for "page not found" errors in watchdog logs doesn't actually contain URLs like /topic. It contains a PHP "placeholder" string like @uri. Additionally, the variables column contains a PHP serialized array like a:1:{s:4:"@uri";s:6:"/topic";}. Autoban RegEx and LIKE queries both run against the raw database column contents (before placeholders are filled), so a pattern like ^/topic will never match.

One workaround would be to use the colon and double-quote characters from the serialized string to mark the start of the URL, e.g. :"/topic, which should match against the variables database column. This needs testing.
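
Here is a small sketch of that idea in Python, to show why the caret pattern misses and the :"/topic pattern hits. It assumes the message column holds just @uri and that the variables column has the serialized shape described above. php_serialize_uri and rule_matches are hypothetical helpers written for this illustration, and Python's re stands in for the database's REGEXP.

```python
import re


def php_serialize_uri(uri):
    """Hypothetical helper: PHP-serialized form of ['@uri' => uri].

    PHP string lengths are byte counts, hence the UTF-8 encoding.
    """
    key = "@uri"
    return 'a:1:{s:%d:"%s";s:%d:"%s";}' % (
        len(key.encode("utf-8")),
        key,
        len(uri.encode("utf-8")),
        uri,
    )


def rule_matches(pattern, message_raw, variables_raw):
    """Mimic the rule: the pattern may match either raw column."""
    return bool(re.search(pattern, message_raw) or re.search(pattern, variables_raw))


message_raw = "@uri"  # assumed raw message for a "page not found" entry
variables_raw = php_serialize_uri("/topic")
print(variables_raw)

# The caret anchors to the start of the column, which is "@" or "a", never "/".
print(rule_matches(r"^/topic", message_raw, variables_raw))

# The serialized delimiter :" marks where the stored URL begins.
print(rule_matches(r':"/topic', message_raw, variables_raw))

# A path that merely contains /topic later on is not picked up.
print(rule_matches(r':"/topic', message_raw, php_serialize_uri("/page/topic")))
```

Note that :"/topic is still a prefix match, so under these assumptions it would also catch /topics or /topic/anything.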

On the other hand, if you want simple regex-style banning of any request to specific URLs without scanning watchdog logs, try the Perimeter Defense module, which does exactly this.
Comment over 1 year ago →
🇺🇦Ukraine goodboy Kharkiv, Ukraine
The watchdog table has two relevant columns: the untranslated text pattern (in English) and the serialized variables. The human-readable text is generated by t(pattern, variables) according to the current language. We can match only against the unified message pattern or against the variables string in its serialized format.
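
To illustrate the difference between what is stored and what is displayed, here is a simplified sketch in Python. render_message is a hypothetical stand-in for the placeholder substitution step only. It ignores the translation and escaping that the real t() performs, and the variables are shown as an already-unserialized dictionary.

```python
def render_message(pattern, variables):
    """Replace each placeholder key in the pattern with its value."""
    text = pattern
    for placeholder, value in variables.items():
        text = text.replace(placeholder, value)
    return text


raw_message = "Updated vocabulary %name."  # what the message column stores
variables = {"%name": "Tags"}              # unserialized variables column

rendered = render_message(raw_message, variables)
print(rendered)

# A rule pattern is compared with the raw columns, not the rendered text,
# so a word that only exists after substitution is absent from the message.
print("Tags" in raw_message)
print("Tags" in rendered)
```

So a pattern written against the text seen in the log viewer can miss, because that text exists only after substitution.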

I don't see a simple solution yet.
