@noelhunter: do you, by any chance, have your cron jobs running as root?
That would justify the behaviour you describe: AOD_Index runs as root (it’s started by a cron job) and creates files and directories. While it’s indexing (I’ve watched it happen) it creates temporary files and then deletes them: that explains why you couldn’t find the files when you looked, and why the problem solved itself later.
You can check which user is running your cron jobs by listing each user's crontab with the crontab command and its “-l” option. The “-u” option is followed by a username: try “root” and also the user that your web server uses.
Another approach would be to grep the syslog for “cron” entries:
sudo grep -i cron /var/log/syslog
(that might vary in your installation, the log can be elsewhere, or you might not even have a syslog).
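If the log is long, it can help to boil it down to just the usernames. This is a minimal sketch, assuming Debian/Ubuntu-style entries of the form `CRON[pid]: (user) CMD (command)`; the function name `cron_users` is made up for illustration, and your log format may differ.

```shell
#!/bin/sh
# cron_users: read syslog lines on stdin and print each distinct user
# that cron ran a command as, with a count of runs, most frequent first.
# Assumed entry format (may differ on your system):
#   Jan  1 00:00:01 host CRON[1234]: (www-data) CMD (some command)
cron_users() {
    # Keep only the text inside the first parentheses after "CRON[pid]:".
    sed -n 's/.*CRON\[[0-9]*]: (\([^)]*\)) CMD.*/\1/p' \
        | sort | uniq -c | sort -rn
}
```

For example: `sudo grep -i cron /var/log/syslog | cron_users`. If “root” shows up in the output, some job is still running as root.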
I would be very interested in knowing what you find because of other investigations I’m making regarding permissions issues. Please report back. Thanks!
With the update to 7.8.1, the errors went away. However, the index was not working. Replacing the AOD_Index directory with a fresh copy from 7.8.1 fixed the problem.
Thanks. It would still be useful for you to check which user your cron jobs are running under. It’s a quick check and it might save you future problems.
The cron jobs are running under the user www-data, and have been for some time. However, in the past, more than six months ago, they did run as root. When we fixed that, we manually changed the bad permissions. That corrected some of the problems, but apparently it (or something) had corrupted the AOD_Index files in a way which could not be repaired simply by changing permissions.
I think that story makes sense: while cron was running as root, permissions got messed up; while permissions were messed up, Index failed to update correctly and was messed up also; later, you had to fix all those three things one by one: get cron to run as www-data, get permissions right, get Index rebuilt.
If you ever need to rebuild the Index again, you don’t have to copy it from a fresh install; you can just delete the directory and it will recreate itself. I have a couple of posts here on the forums detailing that and giving some advice.
A final word of warning: I recommend temporarily disabling the cron jobs during upgrades. This will reduce your odds of ending up with a broken installation.
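One way to pause the jobs without losing them is to comment them out and restore them afterwards. The following is only a sketch: the marker string and the function names `pause_cron` and `resume_cron` are hypothetical, and it works on crontab text passed through stdin rather than touching the live crontab directly.

```shell
#!/bin/sh
# Hypothetical marker used to tag the lines this script commented out.
MARK='#UPGRADE-PAUSED# '

# pause_cron: read a crontab on stdin and comment out every active line.
# Blank lines and existing comments are passed through unchanged.
pause_cron() {
    awk -v m="$MARK" '/^[ \t]*(#|$)/ { print; next } { print m $0 }'
}

# resume_cron: undo pause_cron by stripping the marker from tagged lines.
resume_cron() {
    awk -v m="$MARK" 'index($0, m) == 1 { $0 = substr($0, length(m) + 1) } { print }'
}
```

A cautious way to use it is to save the result to a file first (`sudo crontab -u www-data -l | pause_cron > paused.txt`), look it over, and only then install it with `sudo crontab -u www-data paused.txt`. After the upgrade, do the same with `resume_cron`.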
Well, the problem is back with the 7.8.2 upgrade. All the previous fixes are in place. To clarify what happens: when we run the permission check, it finds files in modules/AOD_Index/Index/Index owned by root.
When we inspect the directory for these files, they do NOT exist. Either they are temporary files created by the index process, or they are somehow cached. If we watch that directory, its contents are constantly changing, with 100+ files created / deleted every few seconds. My guess is that the check process sees them, and then gets 0 as the owner id, because the file has been deleted.
If I disable the cron job altogether (and let it finish), then the file permissions check passes.
I have seen that same behavior with the Index - lots of files get created and then deleted while it’s indexing.
I use this command to check permissions degradation:
tree -iudpf modules/AOD_Index
or even with a dynamic view across time (refreshes every second):
watch -n 1 tree -iudpf modules/AOD_Index
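When the tree output gets long, a narrower question is "which entries are NOT owned by the expected user?". A minimal sketch using standard find options; the function name `not_owned_by` is made up here, and www-data is just the web server user mentioned earlier in this thread.

```shell
#!/bin/sh
# not_owned_by USER DIR: print every path under DIR whose owner is not USER.
# USER may be a login name or a numeric user ID.
# Errors are discarded because files that vanish while find is walking
# the directory (as happens during indexing) would otherwise add noise.
not_owned_by() {
    find "$2" ! -user "$1" -print 2>/dev/null
}
```

For example: `not_owned_by www-data modules/AOD_Index`. An empty result means everything that existed at that moment was owned by www-data.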
Your hypothesis for why they show as errors sounds interesting. I’d never thought of that possibility; it would be interesting to check the code one day to see if it’s possible.
Of course, I assume you are REALLY sure you’re not still running cron as root. :-)
Anyway, as you can see from the Github issues, I really think cron should be disabled while upgrading, so if you do that you shouldn’t have any more problems…
The permission check grabs a list of all files, and then loops through them. It checks whether each one is not writable, and if so gets the file owner ID and the pwname of the owner. I imagine that if a file disappears between the call to uwFindAllFiles and the isWriteable check, that check would return false, which is what I am seeing. When I look for the file, it has always been deleted before I can check. And all of the files that do exist are owned by www-data.
A fix might be to recheck isReadable or exists before flagging a file permissions error. Another fix would be to exclude the Index directory.
Now that I understand all of this, I will just disable cron before I run it.
Excluding AOD_Index would not be a good idea. People do have permissions problems in that directory (real ones, when they run cron as root), and functionality breaks, so we wouldn’t want that to go unnoticed and undiagnosed.
I think rechecking for file existence immediately after is_writable would be a good solution.
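To make the idea concrete, here is the logic sketched in shell rather than in the actual PHP of the permission check, which I am not quoting. The function name `report_unwritable` is hypothetical; the point is only the order of the two tests.

```shell
#!/bin/sh
# report_unwritable: read a list of paths on stdin (a snapshot taken
# earlier) and print only those that are not writable AND still exist.
report_unwritable() {
    while IFS= read -r path; do
        if [ ! -w "$path" ]; then
            # Re-check existence: a temporary file deleted since the
            # snapshot was taken is not a permissions error.
            if [ -e "$path" ]; then
                printf '%s\n' "$path"
            fi
        fi
    done
}
```

For example: `find modules/AOD_Index -type f | report_unwritable`. A file that vanishes between the listing and the test is skipped instead of being reported. There is still a tiny window between the two tests, so this reduces false alarms rather than eliminating them.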
Do you want to do it on Github? Once you have working code (which you develop and test locally), it’s as easy as this. Remember to switch to the hotfix branch before you start the edit.