- @mcdruid opened merge request.
- 🇬🇧United Kingdom mcdruid 🇬🇧🇪🇺
Rerolled the old patch against 2.1.x and put it into an MR.
There's a todo list in #3
I don't think the
_file
property is an issue any more - I've removed it as it wasn't being used for anything much.I've not tested this manually (with any of the different configurations) yet though, and the output from the scanner should be sanitised.
Moving to NR for the testing, but we shouldn't commit this without the sanitisation.
- 🇬🇧United Kingdom mcdruid 🇬🇧🇪🇺
There are quite a few linting errors between cspell / phpcs / phpstan.
A couple of them are introduced by the changes in the MR, but that's as a result of following the patterns / conventions that are already there e.g.
32 | WARNING | Property name "$_output" should not be prefixed with an underscore to indicate visibility | | (PSR2.Classes.PropertyDeclaration.Underscore) 32 | ERROR | Class property $_output should use lowerCamel naming without underscores | | (Drupal.NamingConventions.ValidVariableName.LowerCamelName)
...where there are other properties like
$_virus_name
that hit the same problems.It would probably make sense to address all of this in a dedicated issue, and just follow the existing patterns in this MR.
- 🇬🇧United Kingdom mcdruid 🇬🇧🇪🇺
Manually tested with executable mode; seems to work as expected.
Example log message when the scanner found a problem (with both verbose and debug enabled):
Virus Eicar-Signature detected in uploaded file /tmp/phpqQahST. Debug output: /tmp/phpqQahST: Eicar-Signature FOUND ----------- SCAN SUMMARY ----------- Known viruses: 8704988 Engine version: 1.0.7 Scanned directories: 0 Scanned files: 1 Infected files: 1 Data scanned: 0.00 MB Data read: 0.00 MB (ratio 0.00:1) Time: 10.072 sec (0 m 10 s) Start Date: 2025:03:14 11:57:30 End Date: 2025:03:14 11:57:40
It'd ideally be better if the original filename of the upload was logged instead of the tmp name, but that's a different issue.
Another example where I deliberately broke the clamscan executable - this is the sort of thing debug mode may be more useful for in reality:
Uploaded file /tmp/phpXYXjA4 could not be checked, and was deleted. Debug output: LibClamAV Error: cli_load(): Can't open file /var/lib/clamav/main.cvd LibClamAV Error: cli_loaddbdir: error loading database /var/lib/clamav/main.cvd LibClamAV Error: cli_loaddbdir: No supported database files found in /var/lib/clamav ERROR: Can't open file or directory ----------- SCAN SUMMARY ----------- Known viruses: 0 Engine version: 1.0.7 Scanned directories: 0 Scanned files: 0 Infected files: 0 Data scanned: 0.00 MB Data read: 0.00 MB (ratio 0.00:1) Time: 0.004 sec (0 m 0 s) Start Date: 2025:03:14 12:02:03 End Date: 2025:03:14 12:02:03
I'll try to do some similar tests with the daemon mode too, and socket if I have time (https://github.com/fwust/ddev-clamav looks like it'll make testing daemon mode pretty easy.. socket mode may involve a bit more manual set up).
- 🇬🇧United Kingdom mcdruid 🇬🇧🇪🇺
Daemon mode example log entries (again, verbose + debug enabled):
Uploaded file /tmp/phpuLzC2k checked and found clean. Debug output: stream: OK
Virus Eicar-Signature detected in uploaded file /tmp/phpnGu0G4. Debug output: stream: Eicar-Signature FOUND
...and if I tried to deliberately break the daemon set up..
Uploaded file /tmp/phpkCRZnu could not be checked, and was deleted. Debug output:
Hmm that's not so useful. In that case the daemon wasn't actually listening for connections at all. Would probably be useful to try to log something vaguely informative in that situation.
If I try to break the config of the daemon it just doesn't start up.. so I think not being able to connect is one of the more realistic problems.
The module does actually log messages when it can't connect:
https://git.drupalcode.org/project/clamav/-/blob/2.1.x/src/Scanner/Daemo...
Unable to connect to ClamAV TCP/IP daemon on clamav:3310
- 🇬🇧United Kingdom mcdruid 🇬🇧🇪🇺
This is a bit duplicative.. but for now I've added the same message about the failed connection to the daemon to the
$_output
property so that it gets logged as the scanner output when debug is enabled.I suppose we could omit the other log entry when debug is enabled, but I'm not certain that's a good idea; it might already be used for monitoring etc..
- 🇬🇧United Kingdom mcdruid 🇬🇧🇪🇺
Same deal for connection to the daemon by socket - I've added the same scanner output as for the tcp/ip mode.
- 🇬🇧United Kingdom mcdruid 🇬🇧🇪🇺
I've verified that if we inject something like an XSS payload into the scanner output, it gets logged verbatim.
For example (this is a low effort quick hacky test):
+++ b/src/Scanner/Executable.php @@ -65,6 +65,9 @@ class Executable implements ScannerInterface { // values greater than 2. Any value of 2 or greater means that the scan // failed, and the file has not been checked. exec($cmd, $output, $return_code); + + $output[] = '<script>alert("oh noes!")</script>'; + $this->_output = implode("\n", $output);
An example log entry in the db looks like:
> SELECT * FROM watchdog \G *************************** 78. row *************************** wid: 78 uid: 1 type: Clam AV message: Virus %virusname detected in uploaded file %filename. Debug output: %output variables: a:3:{s:9:"%filename";s:14:"/tmp/phpALzyP1";s:10:"%virusname";s:15:"Eicar-Signature";s:7:"%output";s:361:"/tmp/phpALzyP1: Eicar-Signature FOUND ----------- SCAN SUMMARY ----------- Known viruses: 8704988 Engine version: 1.0.7 Scanned directories: 0 Scanned files: 1 Infected files: 1 Data scanned: 0.00 MB Data read: 0.00 MB (ratio 0.00:1) Time: 10.241 sec (0 m 10 s) Start Date: 2025:03:14 14:01:59 End Date: 2025:03:14 14:02:09 <script>alert("oh noes!")</script>";} severity: 3
The traditional approach in Drupal is to filter on output ( https://www.drupal.org/docs/develop/security/why-does-drupal-filter-on-o... → ). The dblog admin UI escapes this appropriately.
We could pass the debug output through a filter before logging it.. but I'm not sure that's necessary, or best practice.
We don't want to fill syslog log entries with lots of html-encoded characters unnecessarily, for example (I also don't think the typical output of the clam scanner will contain anything bad.. especially as - per my earlier comment - the user-chosen filename is not currently included in these logs).
So I think this MR's ready for review.