"That should only take a few minutes"... [Sardonic Laughter]

I don't think there's any other profession quite like writing software - short of warfare, I can't think of any other area of human endeavour where the potential for casual disaster is so high.

For the last eighteen months, I have been involved in rewriting and modernizing a huge chunk of the 460,000 lines of code that makes up Pegasus Mail. I've written about this (I just can't bring myself to use the verb "blogged", I'm afraid) before, but I thought it might be mildly diverting for some of you to see just exactly how bizarre and off-the-wall this process can often get.

In the course of porting Pegasus Mail from the now ancient Borland compiler I have used for many years to Visual C++, there are various code changes that have been forced on me. One of them is quite basic - the code I use to get listings of the files in a directory. On the face of it, this was straightforward: I evaluated the problem and thought that it might take me a couple of hours to fix. You can see where this is going, I assume... It eventually took me almost five weeks of repeated debugging and testing to get this one simple function working. When I had finally solved it, I sent a message to my beta test team describing what had been involved: here is the text... (note that the reference to "VC12" is to a beta build of the program which is only available to the test team. At the time of writing, we're up to VC13, and are getting ready for a formal release of Pegasus Mail v4.5).
 

 


The problem with files not showing up was quite a lot more complicated, and is part of the ongoing saga of the Windows FindFirstFile and FindNextFile functions. To understand this problem (and it's worth summarizing it, I think), you need to have a little history.

For the longest time, I have used the Borland compiler family to write my code. I never used much Borland-specific code, but some that I *did* use extensively was a pair of functions called findfirst and findnext, which were used to enumerate the files in a directory. They worked reliably for a long time, but it eventually became clear that they had some problems. Furthermore, there was no direct or portable equivalent of these functions in Visual C++. So, I decided to write my own functions to replace them, making my code much more portable.

Now, Windows has a group of functions called FindFirstFile, FindNextFile and FindClose, which on the surface appear to duplicate the operation of findfirst/findnext... On the surface. In reality, these function calls are quirky and buggy as hell. As an example, they treat wildcard characters quite differently depending on the underlying file system: so, the pattern "*.?" will return one set of files if the underlying file system is FAT, another if it's NTFS, and a third, different set if the file system is NetWare. What's more, different combinations of Windows and file system will return different sets of file attributes (so, a plain search on FAT WILL return files marked readonly, but the same search on NetWare WON'T return files marked readonly unless you specifically request them).

Finally, Windows defines an extensive set of file attributes - things like read-only, hidden, system, compressed and so forth... The problem is that they have kept defining attributes over the years, simply adding new ones whenever they want. As a result, it can be quite tricky to work out from its attributes whether a file is an "ordinary" file or not. To understand what I mean by this, consider that you typically WON'T be interested in files that are marked "Offline", because they either cannot be opened, or will take an inordinately long time to be opened because they have to be recovered from backing store. Similarly, files marked as a "Device" should probably never be accessed other than on explicit requirement.

The reason various of you have found that WinPMail is not seeing certain files on your system (typically files without the archive bit set) is that Windows is returning different sets of attributes for these files depending on the OS and file system... So, on NetWare systems, the archive bit is set or not set (no problem), but on NTFS, if the archive bit is not set, another bit called FILE_ATTRIBUTE_NORMAL *IS* set (a situation not true if the file is on a NetWare volume). All of this is interfering with the test I was using to work out whether a file is "ordinary" or not.
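To make the failure concrete, here is a minimal sketch of how a whitelist-style "ordinary file" test can trip over FILE_ATTRIBUTE_NORMAL. The original test is not shown in this post, so `looks_ordinary_naive` is purely hypothetical; the `kAttr...` constants are local stand-ins carrying the bit values documented for the corresponding FILE_ATTRIBUTE_* flags, so the fragment compiles without `<windows.h>`.

```cpp
#include <cstdint>

// Bit values as documented for the Windows FILE_ATTRIBUTE_* constants,
// redefined locally so this sketch does not need <windows.h>.
constexpr std::uint32_t kAttrReadOnly = 0x00000001;  // FILE_ATTRIBUTE_READONLY
constexpr std::uint32_t kAttrArchive  = 0x00000020;  // FILE_ATTRIBUTE_ARCHIVE
constexpr std::uint32_t kAttrNormal   = 0x00000080;  // FILE_ATTRIBUTE_NORMAL

// HYPOTHETICAL whitelist test (an assumption, not the actual Pegasus Mail code):
// a file counts as "ordinary" only if it carries no bits other than
// read-only and archive.
constexpr bool looks_ordinary_naive(std::uint32_t attrs) {
    return (attrs & ~(kAttrReadOnly | kAttrArchive)) == 0;
}
```

With a test of this shape, a file without the archive bit passes when the file system reports it with no bits set at all (attributes 0), but the same file is rejected when it is reported with only `kAttrNormal` set, which matches the symptom described above.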

I have now completely abandoned the "ordinary file" test I was using and have changed it to an inverse test - I now check to see if a file is "unusual" (i.e., has certain specific attributes that I'm not interested in by default) and omit it if it is. This is a considerably easier and more deterministic test and using it bypasses all the bizarre vagaries of OS/file system combinations.
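The inverse test can be sketched roughly as follows. This is an illustration under stated assumptions, not the code that ships in Pegasus Mail: the name `is_unusual`, the `wanted` parameter, and the choice of which attributes count as "unusual" (directory, device, offline) are all hypothetical. The constants again mirror the documented FILE_ATTRIBUTE_* bit values so the fragment compiles without `<windows.h>`.

```cpp
#include <cstdint>

// Bit values as documented for the Windows FILE_ATTRIBUTE_* constants,
// redefined locally so this sketch does not need <windows.h>.
constexpr std::uint32_t kAttrDirectory = 0x00000010;  // FILE_ATTRIBUTE_DIRECTORY
constexpr std::uint32_t kAttrDevice    = 0x00000040;  // FILE_ATTRIBUTE_DEVICE
constexpr std::uint32_t kAttrOffline   = 0x00001000;  // FILE_ATTRIBUTE_OFFLINE

// ASSUMPTION: the attributes treated as "unusual" by default. The real list
// is not given in the post; only Offline and Device are mentioned as examples.
constexpr std::uint32_t kUnusualByDefault =
    kAttrDirectory | kAttrDevice | kAttrOffline;

// Returns true if the entry should be skipped. Bits in `wanted` are attributes
// the caller has explicitly asked for, so they no longer count as unusual.
constexpr bool is_unusual(std::uint32_t attrs, std::uint32_t wanted = 0) {
    return (attrs & kUnusualByDefault & ~wanted) != 0;
}
```

Because only the listed bits are inspected, it makes no difference whether a plain file arrives with the archive bit, with FILE_ATTRIBUTE_NORMAL, or with nothing set at all: `is_unusual` returns false in every one of those cases.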

VC12, which I'll release tonight, implements the new file-scanning code, and I'm 99.9% sure that I've finally got it sorted out. But it's symptomatic of how difficult things are getting that such a basic, fundamental piece of code has been so bloody awkward. I've wasted dozens of hours diagnosing and solving these problems - problems that simply shouldn't have existed in the first place. Hmph.



Sorry if this is tedious or over-technical - I just thought it might be interesting for you to see a little of the kind of thing that goes on "behind the scenes".

More on the v4.5 release soon.

Cheers!

-- David --

Posted: Monday, January 14, 2008 2:27 PM by David Harris

Comments

tigershark said:

Despite the problems, I'm happy to hear that the next release will come soon.

# January 14, 2008 6:28 PM