Subject: How Mercurial could be made working with Virus Scanners on Windows
From: Adrian Buehlmann (adr@cadifra.com)
Date: Dec 8, 2010 12:53:17 am
List: com.selenic.mercurial-devel

I'm writing this to document how far I got in dealing with deleting open files on Windows.

Mercurial obviously deletes files in a lot of places, but it also does so to break up any hardlinks, which is a big theme in Mercurial due to its preference to create hardlinks as much as it can when cloning repositories.

Files can be held open at any time, either by Mercurial itself (e.g. the revlog lazy parser keeps the index file open) or by other programs such as editors (although the sane ones probably read the file quickly and then close it again). More importantly, and quite frequently, they are held open by the ubiquitous antivirus scanners.

As documented on

http://mercurial.selenic.com/wiki/UnlinkingFilesOnWindows

opening files on Windows by programs is mostly done in two flavors:

(A) blocking rename and deletion by other processes
(B) allowing rename and deletion by other processes

'A' is done by Python's built-in 'open' function, and 'B' is done by Mercurial itself (windows.posixfile) and -- lo and behold -- by the majority of virus scanners, including the famous and gratis "Microsoft Security Essentials". The same goes for other popular ones (e.g. Avast).

It's easy to investigate whether your favorite virus scanner is doing 'B' or something else. Use the free Process Monitor by Microsoft and instruct it to log the activity of your virus scanner. Then look at calls of the CreateFile Windows API function

http://msdn.microsoft.com/en-us/library/aa363858(VS.85).aspx

and look at the value of the parameter dwShareMode. If the bit FILE_SHARE_DELETE is set, then the opened file can be renamed and deleted (type 'B'). If your virus scanner does anything more harmful than that, then please get a different one.
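
When reading such a log, the share-mode bits can be decoded with a few lines of Python. This is only a sketch: the flag values are the ones documented for CreateFile, while the helper name describe_share_mode is made up for this illustration and is not part of Mercurial.

```python
# Share-mode flags as documented for the CreateFile Windows API.
FILE_SHARE_READ = 0x00000001
FILE_SHARE_WRITE = 0x00000002
FILE_SHARE_DELETE = 0x00000004


def describe_share_mode(dw_share_mode):
    """Classify a dwShareMode value as type 'A' or 'B' (hypothetical helper)."""
    if dw_share_mode & FILE_SHARE_DELETE:
        return "B"  # other processes may rename and delete the open file
    return "A"  # rename and deletion by other processes are blocked


if __name__ == "__main__":
    print(describe_share_mode(FILE_SHARE_READ))
    print(describe_share_mode(FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE))
```
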

Now, back to Mercurial. Method 'B' is the best we can get, but it is still causing a major problem: files that were opened using 'B' are sent into a ghost state if os.unlink is called on them. I've called that state "scheduled delete" on the UnlinkingFilesOnWindows wiki page.

Files in this ghost state block the filename for as long as any reading process doing 'B' holds the file open. This means no file can be created under that name again, even though os.unlink was called on it. Pretty amazing.

I consider solving this problem a key point in getting Mercurial to work with virus scanners on Windows.

How can it be done?

Files opened with method 'B' can be renamed. Processes holding the file open continue to hold it open under the new name. So if you os.unlink a file after renaming it, the filename blocking of the "ghost file" state applies to the new name.

So a trick to harden Mercurial against AV scanners is to rename files to a random name before deleting them. The dreaded filename blocking (the ghost state) then happens on a random name and not on the precious original name of the file, which may be needed again to recreate the file under the same name (util.opener does that to break up hardlinks when 'w'riting to files).
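
A minimal sketch of that trick, using only os.rename and os.unlink from the standard library. The function name unlink_via_rename and the temporary naming scheme are assumptions made for this illustration; this is not code from Mercurial. Error handling for files opened with method 'A' (where the rename itself fails on Windows) is left out.

```python
import os
import random
import tempfile


def unlink_via_rename(path):
    """Rename path to a random sibling name, then unlink it (hypothetical helper).

    If a scanner holds the file open with FILE_SHARE_DELETE, the ghost
    state then blocks the random name instead of the original one.
    """
    directory = os.path.dirname(path) or "."
    base = os.path.basename(path)
    for _ in range(10):
        # Assumed naming scheme: hidden sibling with a random hex suffix.
        temp = os.path.join(directory, ".%s-%08x" % (base, random.getrandbits(32)))
        if not os.path.lexists(temp):
            break
    else:
        raise OSError("could not find an unused temporary name for %r" % path)
    os.rename(path, temp)  # the original name is free from here on
    os.unlink(temp)  # any ghost state now affects only the random name


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    target = os.path.join(workdir, "00changelog.i")
    with open(target, "w") as f:
        f.write("old contents")
    unlink_via_rename(target)
    # The original name can be reused right away.
    with open(target, "w") as f:
        f.write("new contents")
```

The rename has to stay within the same directory so that it remains a cheap operation on the same volume.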

I'm writing up these findings so that this information may be picked up by whoever is interested in trying to make Mercurial work with AV scanners on Windows. Feel free to take on this task.