Pegasus Mail & Mercury

Welcome to the Community for Pegasus Mail and
The Mercury Mail Transport System, the Internet's longest-serving PC e-mail system!

100,000 messages in folder- indexed ?

Last post 11-08-2008, 18:19 by hostek. 5 replies.
  •  08-13-2008, 16:38

    100,000 messages in folder- indexed ?

    Greetings,

    I do a lot of technical research worldwide and need to keep received email messages in separate folders by subject etc. For 7 years I have been using a database-driven program that has indexes on sender, subject, date, size, priority, read status etc.

    These indexes let me open a large folder with a one-second load time, as only the indexes need to load. They also allow very sophisticated searches that combine the different indexes, such as: all messages from a sender that contain a specific character string, between two dates, where the subject contains a specific string and does NOT contain a certain character string. This was Netscape Communicator version 4.0, 2003.

    I hope not to have to change email systems again for another 7 years and am looking for a similar system that is at least as capable, given the much faster computers these days. Can Pegasus help me out here, and if not, can anyone recommend anything? The later Netscape program and Thunderbird were downsized in capability and I do not know why they would do that.

    Thanks, sincerely,

    Scott in Atlanta, Georgia USA

  •  08-13-2008, 17:06

    Re: 100,000 messages in folder- indexed ?

    The Pegasus Mail foldering system currently cannot do what you are asking.  You certainly would not be able to do this with a folder containing 100K messages.

    That said, there are third party applications out there that can extract mail from folders and then store it in a database for searching.  The program Aid4Mail Pro can extract mail to MHT Web Archive files (*.mht) - linked from an HTML index page or an MS Excel Workbook file.  This probably could do what you are looking for and it's not specific to any mail program. http://www.aid4mail.com/specifications.php

     


    Thomas R. Stephenson
    San Jose, California
    Member of Pegasus Mail Support Team
  •  08-13-2008, 17:45

    Re: 100,000 messages in folder- indexed ?

    Thank you Thomas. You seem to be pretty technically adept re how the program works.

    Your mention of other programs that can load messages into a database... does that mean Pegasus does not use a database engine? Could you explain what sort orders are offered when viewing messages? That will give me an idea of the indexes being maintained for the message files. I have read about the PMI files and figured them to be index files.

    What is the format of the message files? Are all the messages kept in one file per folder, or are all messages kept as separate message files, much like image files?

    Boy, I'd really like to find an email database program all integrated together, as I really hate to give up functionality I've had for 7 years. I'll tell you this: once you've had it you would easily give a hundred dollars to keep it, and a thousand users would add up to a hundred thousand dollars. Actually, I'd be willing to pay a lot more than a hundred dollars to solve my problem.

    Thanks

  •  08-13-2008, 18:34

    Re: 100,000 messages in folder- indexed ?

    Your mention of other programs that can load messages into a database... does
    that mean Pegasus does not use a database engine?  Could you explain what sort orders are offered when viewing messages ?  

    The program uses a folder/index pair, where the folder (PMM file) is just the RFC 2822 messages in a file and the index (PMI file) is what is displayed in the folder listing.  You are pretty much limited to 64K messages max in any one folder.  You are also limited to 2 GBytes per folder.  Your available sort orders are From:, Subject:, Date: and Size:, with a threaded alternative based on Subject: and Date:.
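
    As a rough illustration of what that limit means for the 100,000-message case in this thread, here is a minimal sketch. It assumes "64K" means 65,536 messages per folder; the function name pmm_folders_needed is made up for the example and is not part of Pegasus Mail.

```shell
# Assumption: "64K messages" is taken to mean 64 * 1024 = 65536 per folder.
PMM_MAX_MESSAGES=65536

# Hypothetical helper: minimum number of folders needed for a message count
# (ceiling division in integer arithmetic).
pmm_folders_needed() {
    echo $(( ($1 + PMM_MAX_MESSAGES - 1) / PMM_MAX_MESSAGES ))
}
```

    Under that assumption, pmm_folders_needed 100000 prints 2, so the mail would have to be split across at least two folders. The 2 GByte size limit is not modelled here and may force a finer split.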

    What is the format of the message files?  Are all the messages kept in one file per folder or are all messages kept as separate message files much like image files ?

    This is the info from the old admin manual and it's a bit dated but generally correct. 

    Folders:

    A Pegasus Mail 2.2 folder consists of two files. The master file has the extension .PMM: it has a 128-byte header of which the first 50 bytes are the long name of the folder, the remainder being reserved. Following the header is the text of all the messages in the folder, separated by ^Z characters (ASCII 26). Some of the messages stored in the folder may in fact be deleted; there is no way of determining this without consulting the record matching the message in the .PMI index file for the folder.
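
    The .PMM layout described above can be inspected from a shell. The following is a minimal sketch, assuming the long name is NUL-terminated within its 50 bytes and that Ctrl-Z never occurs inside a message body. The function names are invented for illustration, and head -c / tail -c are assumed to be available (they are in GNU and BSD userlands).

```shell
# Print the long folder name: the first 50 bytes of the header,
# cut at the first NUL byte (assumes the name is NUL-terminated).
pmm_folder_name() {
    head -c 50 "$1" | tr '\000' '\n' | head -n 1
}

# Count Ctrl-Z (octal 032) bytes after the 128-byte header.
# Deleted messages are still counted; only the .PMI index marks those.
pmm_separator_count() {
    tail -c +129 "$1" | tr -cd '\032' | wc -c | tr -d ' '
}
```

    The separator count should be close to the number of messages physically stored in the file, which is not necessarily the number shown in the folder listing.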

    The other file has the extension .PMI, and consists of a representation of each message using a structure called an IMESSAGE, shown below in its C language definition (note that all integer values in the file are stored in Intel word order; Pegasus Mail for the Macintosh does whatever conversion is necessary for its Motorola processor as it loads each entry from the index):

    typedef struct
    {
        unsigned long flags;
        unsigned long fpos;     /* Offset in master file */
        WORD msg_number;        /* Msg ordinal position */
        char fname;             /* Unused in folders */
        char from [30];         /* Sender of message */
        char subject [36];      /* Guess what this is */
        char date [20];         /* Reduced form of date */
        long mtime;             /* See below */
        long fsize;             /* Bytes in this message */
    } IMESSAGE;

    flags is a bitmap of message characteristics, using the following values:

    0x1 The message has Pegasus Mail-style attachments
    0x2 The message is a uuencoded file transmission
    0x4 The message is encrypted
    0x80 The message has been read
    0x2000 Sender requests confirmation of reading
    0x20000L The message is a copy to self
    0x40000L The message has been deleted
    0x80000L The message is in a MIME transmission format
    0x100000L A reply has been sent for this message
    0x200000L The message has been forwarded to another user
    0x400000L The message is urgent (never seen in folders)
    0x800000L The message contains BinHex-encoded enclosures
    0x1000000L The message originates from an MHS system
    0x2000000L The message originates from an SMTP system.
    0x4000000L The message has annotations.
    0x8000000L The message contains enclosures

    All other values are reserved - do not use them.
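
    To make the bitmap concrete, here is a small sketch that turns a flags value into the meanings listed above. The function name and the short labels are shorthand invented for this example, not anything defined by Pegasus Mail; the masks are copied from the table.

```shell
# Hypothetical helper: list the documented flag bits set in a flags value.
# Accepts decimal or 0x-prefixed hex; labels are shorthand for the table.
pmi_describe_flags() {
    flags=$(($1))
    while read -r mask label; do
        if [ $(( flags & $mask )) -ne 0 ]; then
            echo "$label"
        fi
    done <<'EOF'
0x1 pegasus-style attachments
0x2 uuencoded file transmission
0x4 encrypted
0x80 read
0x2000 confirmation of reading requested
0x20000 copy to self
0x40000 deleted
0x80000 MIME transmission format
0x100000 replied
0x200000 forwarded
0x400000 urgent
0x800000 BinHex-encoded enclosures
0x1000000 originates from MHS
0x2000000 originates from SMTP
0x4000000 has annotations
0x8000000 contains enclosures
EOF
}
```

    For example, pmi_describe_flags 0x40080 prints "read" and "deleted" on separate lines, which is how a message that was read and then deleted would show up in the index.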

    The mtime field is a crude calculation of the number of seconds since Jan 1 1990, used only for sorting purposes. It is not intended to be accurate, merely "near enough".
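
    If the mtime value is wanted as something closer to an ordinary timestamp, one option is to shift it onto the Unix epoch. This is a minimal sketch, assuming mtime really does count seconds from 1990-01-01 00:00:00 and ignoring time zones (the manual itself calls the value crude). The function name is hypothetical.

```shell
# Seconds between 1970-01-01 and 1990-01-01 (UTC):
# 20 years * 365 days + 5 leap days = 7305 days * 86400 s.
PMI_EPOCH_OFFSET=631152000

# Hypothetical helper: convert a .PMI mtime to an approximate Unix timestamp.
pmi_mtime_to_unix() {
    echo $(( $1 + PMI_EPOCH_OFFSET ))
}
```

    With GNU date, running date -u -d "@$(pmi_mtime_to_unix 0)" +%Y-%m-%d should print 1990-01-01. Treat any result as approximate, for the reasons given in the manual text.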

    The fsize field can be larger than the actual size of the message, but in no circumstance should it be less - this will cause Pegasus Mail to crash.

     

    Boy, I'd really like to find an Email database program all integrated together  as I really hate to give up functionality I've had for 7 years.
    I'll tell you this, once you've had it you would easily give a hundred dollars to keep it and a thousand users would add up to a hundred thousand dollars.
    Actually I'd be willing to pay a lot more than a hundred dollars to solve my problem.

    You might, most would not.  ;-(  Most people will not pay anything for an e-mail program no matter what bells and whistles it has.  In the corporate world it's generally Outlook and Exchange or Lotus Notes and for the single user whatever is on their system when delivered.  

     


    Thomas R. Stephenson
    San Jose, California
    Member of Pegasus Mail Support Team
  •  08-16-2008, 0:10

    CobraA1

    Re: 100,000 messages in folder- indexed ?

    I think these days people are moving more towards integration of email with calendars, contacts, and TODO lists rather than integration with databases.

    The only thing I can think of right now that can handle that amount of data is gmail, but that's online, so you don't have direct access to the database and won't have the level of control you want.

  •  11-08-2008, 18:19

    hostek

    Re: 100,000 messages in folder- indexed ?

    I had this happen to me too.  However, the information provided in this post gave me an idea of how to fix the problem.  I had one of my programmers write a few lines of code to get my mail folder usable again.  I'll paste the code below in case it helps others with this.  It did work for me!!

    He created a bash script named pmail on one of our Linux servers.  I'll give the instructions first, and then paste the code at the bottom.  Here's the command line to run the script based on how it's written:

    ./pmail foldername.pmm 3000

    For example, if you wanted to extract the last 3000 emails from the folder named FOL06A16.PMM, you would use the following command:

    ./pmail FOL06A16.PMM 3000

    The script will output 2 files. The first, new.FOL06A16.PMM, will exclude the last 3000 emails in this case.

    The second file will be named "lemails.PMM" and will include the last 3000 emails in this case.

    First, exit Pegasus before you start any of these procedures.

    Next, move the foldername.pmm (i.e., FOL06A16.PMM) out of your Pegasus mail directory.  You should keep it as a backup, and it must NOT be in this directory to accomplish the fix.

    Next, move the lemails.PMM file into your Pegasus mail directory (usually C:\PMAIL\MAIL\[USER] ).

    Next, open Pegasus and you will see a new "empty" folder. Right click on it and click on "Reindex folder".  Then click on the "empty" folder and click Rename in the top icon row to rename the folder to what you wish.

    Now close Pegasus, copy in the new.foldername.pmm (i.e., new.FOL06A16.PMM) and rename it to its original name, like FOL06A16.PMM.

    Now open Pegasus and you will see another "empty" folder.  Right click on it and click on "Reindex folder".  Then click on the "empty" folder and click Rename in the top icon row to rename the folder to what you wish (note:  do not give it the same name as the other folder above).

    Now close Pegasus and then open it again.  You should be all set now.

    Note:  Don't delete your original folder file in case there are any errors.

    -------------------------------------------------------------------------------------------------------------------------

    Code below:

    -------------------------------------------------------------------------------------------------------------------------

    #!/bin/bash
    # Usage: ./pmail foldername.pmm N

    # tr maps single characters, not strings: a multi-character marker such as
    # "~#@~" would turn every ~, # and @ into a newline on the way back.
    # Use one placeholder byte instead (\001, assumed not to occur in the mail)
    # so that each email ends up on a single line.
    tr '\n' '\001' < "$1" > newpmail

    # convert all Ctrl-Z separators to newlines
    tr '\032' '\n' < newpmail > newpmail2

    # count the number of emails (lines)
    filwc=$(wc -l < newpmail2)
    echo "$filwc"

    # number of emails that stay in the original folder
    origfile=$((filwc - $2))
    echo "$origfile"

    # everything except the last N emails goes to new.<foldername>;
    # newlines become Ctrl-Z again and the placeholder becomes a newline
    head -n "$origfile" newpmail2 | tr '\n' '\032' | tr '\001' '\n' > "new.$1"

    # the last N emails go to lemails.PMM, converted back the same way
    tail -n "$2" newpmail2 | tr '\n' '\032' | tr '\001' '\n' > lemails.PMM

    # remove temp files
    rm newpmail newpmail2

    -------------------------------------------------------------------------------------------------------------------------

    Save the file as pmail

    Make sure the execute permissions on the file are set (chmod 755 pmail)

    Run the script: ./pmail [filename] [number of emails]
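
    Before reindexing, it may be worth checking that nothing was lost in the split. The sketch below is not part of Max's script. It assumes the two output files together should contain exactly the bytes and Ctrl-Z separators of the original, and the function names are made up for the example.

```shell
# Count Ctrl-Z (octal 032) bytes in a file.
count_ctrl_z() {
    tr -cd '\032' < "$1" | wc -c | tr -d ' '
}

# Hypothetical check: succeed only if the two parts add up to the original,
# both in total size and in number of Ctrl-Z separators.
verify_split() {
    orig=$1 first=$2 last=$3
    size_sum=$(( $(wc -c < "$first") + $(wc -c < "$last") ))
    orig_size=$(( $(wc -c < "$orig") ))
    sep_sum=$(( $(count_ctrl_z "$first") + $(count_ctrl_z "$last") ))
    orig_sep=$(( $(count_ctrl_z "$orig") ))
    [ "$size_sum" -eq "$orig_size" ] && [ "$sep_sum" -eq "$orig_sep" ]
}
```

    Run against the backup copy, this would look like: verify_split FOL06A16.PMM new.FOL06A16.PMM lemails.PMM && echo OK. If it fails, keep the backup and do not reindex until the difference is understood.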

    -------------------------------------------------------------------------------------------------------------------------

    I hope this will help others, as I know how important it is to be able to access past email.

    Code written by  Max Maksimov.  If the code doesn't post correctly here, let me know and I will send the details to you. 

     

    Brian A 

     

Copyright © 2007 David Harris / Peter Strömblad. All Rights Reserved.