poker10 credited mikeytown2.
Day job is C# for me these days. Anyway, the advagg bundler works based off the pages hit; crawl your test environment and it'll mimic production. CSS/JS can be added to any page in Drupal in various ways, and there's no way to know ahead of time which assets will be added or in what order. That's a limit of D7: an asset has to be added first, and then advagg can learn and react to it.
There's a hook in the bundler that'll allow you to modify what you need. Currently the bundler tries to have an equal number of files in each bundle. Due to minification and gzip it's hard to predict the resulting file size.
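To illustrate what "equal number of files in each bundle" can look like, here is a minimal sketch, assuming PHP 7 or later. The function name `example_split_into_bundles()` and its parameters are made up for this example; this is not AdvAgg's bundler code, just one way to split an ordered file list by count while keeping the original order.

```php
<?php
/**
 * Split an ordered list of files into at most $max_bundles groups whose
 * sizes differ by no more than one file. Hypothetical helper for
 * illustration only; not taken from AdvAgg.
 */
function example_split_into_bundles(array $files, int $max_bundles): array {
  $files = array_values($files);
  $total = count($files);
  if ($total === 0 || $max_bundles < 1) {
    return [];
  }
  // Never create more bundles than there are files.
  $bundle_count = min($max_bundles, $total);
  $base = intdiv($total, $bundle_count);
  $extra = $total % $bundle_count;

  $bundles = [];
  $offset = 0;
  for ($i = 0; $i < $bundle_count; $i++) {
    // The first $extra bundles each take one additional file.
    $size = $base + ($i < $extra ? 1 : 0);
    $bundles[] = array_slice($files, $offset, $size);
    $offset += $size;
  }
  return $bundles;
}
```

Balancing on file count rather than bytes sidesteps the size-prediction problem, at the cost of bundles that may differ a lot in actual size.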
Also see #1377740: file_unmanaged_move() should issue rename() where possible instead of copy() & unlink() and #818818: Race Condition when using file_save_data FILE_EXISTS_REPLACE.
In terms of headers being different, I don't think it's that big of a deal. Here is my handler for sending out a request from AdvAgg: http://cgit.drupalcode.org/advagg/tree/advagg.missing.inc#n195.
Not sure if this is an issue in D8, but in D7 drupal_tempnam() does not pass along the subdir if using a stream wrapper. Here's the code from AdvAgg to handle this situation:
// Correct the bug with drupal_tempnam() where it doesn't pass subdirs to
// tempnam() if the dir is a stream wrapper.
$scheme = file_uri_scheme($uri_dir);
if ($scheme && file_stream_wrapper_valid_scheme($scheme)) {
  $wrapper = file_stream_wrapper_get_instance_by_scheme($scheme);
  $dir = $wrapper->getDirectoryPath() . '/' . substr($uri_dir, strlen($scheme . '://'));
  $uri = $dir . substr($uri, strlen($uri_dir));
}
else {
  $dir = $uri_dir;
}
You'll want to write to a temp file in the target dir before renaming, so this will need to be fixed as well.
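The write-to-temp-then-rename idea can be sketched in plain PHP as follows. This is a minimal illustration under stated assumptions: it works on local filesystem paths only (no stream wrappers), assumes PHP 7 or later, and the name `example_atomic_write()` and the 0644 mode are made up for the example. It is not the AdvAgg implementation.

```php
<?php
/**
 * Write $data to $destination by way of a temp file in the same directory.
 * Hypothetical helper for illustration; local paths only.
 */
function example_atomic_write(string $destination, string $data): bool {
  $dir = dirname($destination);
  // Create the temp file next to the target so rename() never has to
  // cross filesystems.
  $temp = @tempnam($dir, 'tmp_');
  if ($temp === false) {
    return false;
  }
  // tempnam() may fall back to the system temp dir if $dir is unusable;
  // treat that as a failure rather than renaming across filesystems.
  if (realpath(dirname($temp)) !== realpath($dir)) {
    @unlink($temp);
    return false;
  }
  if (file_put_contents($temp, $data) !== strlen($data)) {
    @unlink($temp);
    return false;
  }
  // tempnam() creates the file with restrictive permissions; 0644 is an
  // assumed mode so a web server can read the result.
  @chmod($temp, 0644);
  // Readers see either the old file or the complete new one.
  if (!@rename($temp, $destination)) {
    @unlink($temp);
    return false;
  }
  return true;
}
```

On POSIX filesystems a rename() within one directory replaces the target in a single step, which is why the temp file has to live in the target dir and not in the system temp location.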
I have a lot of extra code in advagg to handle the fact that file_unmanaged_save_data() is not atomic; thanks for the tip! http://cgit.drupalcode.org/advagg/tree/advagg.missing.inc#n979
In terms of deployments it looks like we both came to the same conclusion; do the best that we can, but don't write the file (I still need to make this happen).
Quick note that this can cause a race condition if you're deploying code and have 3 or more webheads.
Request comes in on webserver #1 which has the new code; the request for the CSS file hits webserver #3 which has the old code; file gets generated with the old code. In rare circumstances this can result in a zero byte aggregate being created (new file deployed, which is the only file in the aggregate).
Heads up, the D7 version of AdvAgg is now in beta. It stores the files and order used in the creation of the aggregate, a hash of all the files' content (versioning), and any setting that affects the output of an aggregate (CSS/JS minification, CDN settings, embedded images, http/https, etc). Also has stampede protection, backports D8 JS code, and is coded to reduce disk I/O.
D7 AdvAgg is not blocked, other than me needing time to do it. The D7 version of AdvAgg will rely on HTTPRL. Just stating that there is a lot going on.
This thread is for a D8 solution; backporting to 7.x would be very hard to do. I've been working in #1447736: Adopt Guzzle library to replace drupal_http_request() and in Comparison of HTTP Client Libraries. I've concluded that Guzzle is the way forward; it offers what we're looking for in an HTTP client. For the one thing it was lacking at the time, non-blocking requests, the author made it happen. That's huge. If we come across other issues when integrating it, it's nice to know that mtdowling is willing to help.
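As a rough illustration of versioning an aggregate by file order, file contents, and output-affecting settings, here is a minimal sketch assuming PHP 7 or later. The function `example_aggregate_name()`, the `type__hash.type` naming pattern, and the settings keys are all hypothetical; AdvAgg's actual scheme and storage are not shown here.

```php
<?php
/**
 * Derive an aggregate filename from the ordered file list, each file's
 * content hash, and the settings that affect output. Hypothetical helper
 * for illustration only.
 */
function example_aggregate_name(array $files, array $settings, string $type): string {
  $parts = [];
  foreach ($files as $path) {
    // File order is preserved, so reordering produces a different name.
    $hash = is_readable($path) ? hash_file('sha256', $path) : 'missing';
    $parts[] = $path . ':' . $hash;
  }
  // Sort top-level settings keys so their order does not change the name.
  ksort($settings);
  $fingerprint = hash('sha256', implode("\n", $parts) . "\n" . json_encode($settings));
  return $type . '__' . $fingerprint . '.' . $type;
}
```

With a name derived this way, any change to a source file or to a setting such as minification yields a new filename, so stale aggregates are never served under the new name.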
At the core of AdvAgg is a whole lot of hooks to make it pluggable and the ability to generate a CSS/JS file based off the filename.
My objection to #47 is that everything is stored in a variable instead of a database table. Having things in 2 tables like I have in AdvAgg allows for things like the bundler to work correctly and for smart aggregate flushing. #55, from my understanding, deals more with the mechanics of how the hook_menu of this works, which is covered in WSCCI.
I would be ok with #352951: Make JS & CSS Preprocessing Pluggable, but I think shipping with something like AdvAgg (doesn't have to be AdvAgg) as the default would be good for the Drupal community.
Note: There are 139 followers for the D7 patch of advagg #1171546: AdvAgg - D7 Port/Re-write and over 1,600 users of AdvAgg.
I still believe using a database table for CSS/JS aggregates is the correct way to do this. Getting a lot of what AdvAgg does in core is going to take some work if we want to go that route.
A better HTTP client via HTTPRL is the first step, since advagg generates the aggregates in the background (multiprocess PHP). I'm pretty happy with what HTTPRL currently does, so creating a core patch to incorporate it is where I would start (HTTPRL was born out of AdvAgg code). The next step is to make it pluggable (#64866-37: Pluggable architecture for drupal_http_request()) so different backends can be put in place if cURL is desired.
The current limitation that AdvAgg has is it doesn't take advantage of multi-sites. If we have the same CSS aggregate used across multiple sites on our server, taking advantage of that will make things a lot snappier (only need to generate one aggregate) and save disk space. This would require a global files folder, one that doesn't depend on the current site being accessed. Global storage of the old files that need to be deleted could be bypassed if we have a REST API for querying if the aggregate is old. This would require one site to be aware of all the other sites inside of the multi-site install so it knows what sites it should query via HTTP.
Once a better HTTP client is in core and (optionally) a global files directory is available, I might have the D7 version of AdvAgg done, at which point we could create a patch for D8 core. This is my recommendation for core's CSS/JS aggregation. It's a big undertaking, but a lot of what AdvAgg does is what we're trying to do. It uses the same codepath for both CSS and JS; S3/CDN friendly file-naming; caching for faster page generation times; etc... it does a lot, thus it would be a large patch. Gotta run...
@pounard
You should also include md5 if you haven't done so as a way to test if a file has changed. Not all filesystems keep an accurate mtime. Do you want to team up? Sounds like you're starting to port a lot of advagg's code into D7 & into your module. I already have my first request for a D7 version of advagg [#31171546]. We both have the same goal; we should work together. PS: you need to call clearstatcache() before using filemtime().
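A minimal sketch of that change check, assuming PHP 7 or later: record both the mtime and an md5 of the content, and clear PHP's stat cache before reading the mtime. The names `example_file_state()` and `example_file_changed()` are hypothetical and not part of advagg or any other module.

```php
<?php
/**
 * Capture the state used to detect changes to a file.
 * Hypothetical helper for illustration only.
 */
function example_file_state(string $path): array {
  // PHP caches stat() results per request; clear them for this path so
  // filemtime() reflects what is on disk right now.
  clearstatcache(true, $path);
  return [
    'mtime' => filemtime($path),
    'md5' => md5_file($path),
  ];
}

/**
 * Report a change if either the mtime or the content hash differs from
 * the previously recorded state.
 */
function example_file_changed(string $path, array $previous): bool {
  $current = example_file_state($path);
  return $current['mtime'] !== $previous['mtime']
    || $current['md5'] !== $previous['md5'];
}
```

The md5 comparison catches edits that leave the mtime untouched, which is the case the mtime check alone would miss.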
@pounard
I never use master; here is the 6.x-1.x Tree. advagg.missing.inc is what you're interested in most likely.
In terms of the file's existence, htaccess handles that; I use locks.inc for locking; it stalls, then issues a 307 once the file is generated.
@pounard
What are your thoughts about my side project http://drupal.org/project/advagg?
In terms of matching the files directory and intercepting it before menu_execute_active_handler fires, check out http://drupal.org/project/files_proxy (runs in hook_init). These are done in D6, but once advagg reaches 1.0, I will be working on a 2.0 release for 7.x; aggregation changed a lot between D6 & D7 so having a 2.x release makes sense.
6.x version that covers just about everything: http://drupal.org/project/advagg If you're wondering, I've been kicking this idea around in my head for about a year now: http://groups.drupal.org/node/53973. Couple of things this does that might interest this crowd that are not listed in the readme.
- Will test the CDN and non CDN url path to see if the advagg 404 menu hook is called.
- Implements fast 404s on a complete miss.
- Request is sent async to the newly created bundle on page generation. That means the bundle is generating in the background while the page is getting generated on the server... yes multi-threading in short.
- In theory on demand generation will work with clean URLs disabled.
I'm sure I've missed some others... in short advagg rocks! Feedback is greatly appreciated.