Wednesday, July 30, 2008

The migraine that is email archiving – input needed

I’m pretty sure that my company is very much like any other company, with out-of-control email inbox sizes and politics getting in the way of putting the hammer down on mail file size restrictions. I’ve been tasked with being involved with email archiving here at the ranch and, to be honest with you, it is not a task that I am looking forward to implementing. Being a single Domino Administrator and having another level of complexity put on top of my other daily chores (like rolling out R8) is cause for me to freak over this a little. With our marching orders given, we set off to find a solution that would fit our needs.
But as I have found, our needs are different from other companies'. First, we do not support email archiving using the method built into Lotus Domino. Second, we do file retention based on when the memo document was last modified and not when the document was created. And lastly, we are not looking for a solution that is centered on e-discovery. Basically we are looking for something to clean up email files and allow us to continue our file retention. Now if you have looked into this space at all, you might be shaking your head right now. After talking with several vendors, I sure was. Almost all solutions on the market archive based on when the document was created. Any time the archive task acts against a document in the mail file, the modification date changes and our retention policy flies right out the window.
We settled on one solution that could be customized to give us the flexibility to still use the modified date by modifying the mail template (even more than I am comfortable with), so that a new field is introduced to act as the modified date for the end user.
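To make the idea of a separate "modified date" field concrete, here is a minimal sketch in Python rather than in Notes code. It is not the vendor's implementation: the `Memo` class, the `user_modified` field, and the function names are all hypothetical, and it only assumes that the archive task changes the system modification date while leaving the custom field alone.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Memo:
    created: datetime
    last_modified: datetime   # system timestamp: changes on every save
    user_modified: datetime   # hypothetical custom field added to the template


def user_edit(memo: Memo, when: datetime) -> None:
    """An end user saves the memo: both timestamps move forward."""
    memo.last_modified = when
    memo.user_modified = when


def archive_touch(memo: Memo, when: datetime) -> None:
    """The archive task acts on the memo: only the system timestamp moves."""
    memo.last_modified = when


def past_retention(memo: Memo, now: datetime, retention: timedelta) -> bool:
    """Retention is judged on the custom field, not the system timestamp."""
    return now - memo.user_modified > retention


received = datetime(2007, 1, 15)
memo = Memo(created=received, last_modified=received, user_modified=received)

archive_touch(memo, datetime(2008, 7, 1))   # archiving resets last_modified only
now = datetime(2008, 7, 30)
retention = timedelta(days=365)             # assumed retention period
print(past_retention(memo, now, retention))
```

Because retention is checked against the custom field, archiving activity no longer extends a memo's life, while a genuine user edit still does.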
So blogsphereians, I ask you. What email archiving solution do you use and why?


Roland Reddekop said...

Perhaps the "powers that be" should revise their doc retention policy with respect to email to something that makes sense.
What I mean is that for email, the created date, not the last modified date, is the appropriate time stamp to use.

Question to ask:
Is there a good reason why the archive cutoff is based on last modified date versus created date? The nature of emails is not like that of word processor or spreadsheet files. Those documents are modified all the time, but emails in general should not be modified after being received. Of course, the modified date does change due to activities like using the follow-up flag function, replying to or forwarding the email (those forward and reply view icon flags that are stored in the email), or even by custom agents that touch mail documents. But in general, the content of the email should be a snapshot in time of what was sent and therefore the doc created date is the appropriate timestamp for archiving emails.

Perhaps I am just having a mind block and can't think of any justifiable reasons to use the last modified date for archiving email. What would they be?

Andy Donaldson said...

Thanks for the feedback Roland! I've been arguing along those lines for a while, but it comes back to if someone is "using" the document and needs to hold onto it longer, then the modification date gives them the chance to hold onto it longer. It makes my head hurt as well.

IdoNotes said...

Keep mail? Are you serious? As in longer than a few days? No one does that madness. How long do they keep mail at home? Users are packrats I say, packrats

Roland Reddekop said...

I feel your pain Andy...

You added: "it comes back to if someone is 'using' the document and needs to hold onto it longer, then the modification date gives them the chance to hold onto it longer"

If this is their logic, then they are not getting what they think anyway because just reading an email or even viewing an attachment inside the email won't alter the last modification date (without custom code).

That said, we just use the vanilla Notes archive-to-server policy for anything over 6 months old. The default filter used by Notes is based on the doc's Last Modified date, which is fine for our requirements. Currently our plan is for archives to contain 5 years of email, then to push it to tape only for years 6 and 7, and then purge it.

We are counting on an implementation of Quickr to offload attachments plus document compression (Notes 8.0.1) plus DAOS (Notes 8.5) to help us achieve this policy over the next 5 years. We shall see.

Richard Schwartz said...

Something's got me confused here.

You wrote: "Any time the archive task acts against a document in the mail file, the modification date changes and our retention policy flies right out the window."

What do you mean "acts against"? Do you mean that the archive product is updating the mod date when it archives a message? Or that it updates the mod date every time it looks at the message?

The big problem with using the mod date for archiving is that it gives users a way to subvert the archiving policy. All they have to do is run through the messages and touch them so that the mod date changes, and poof!... the message is not archived. Your goal of reducing mail file storage flies out the window.

BTW: Have you been following IBM's plans for a feature called "DAOS" in Domino 8.5? It will massively reduce the storage requirement for mail files. No help to you now, of course, and because it is transparent to backup programs it does nothing to reduce your backup load either.
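The "touch" workaround described above can be illustrated with a small Python sketch. This is not Notes or archive-product code: the dictionary keys, the `due_for_archive` helper, and the 180-day cutoff are assumptions chosen only to show how the two timestamps behave.

```python
from datetime import datetime, timedelta


def due_for_archive(stamp: datetime, now: datetime, cutoff: timedelta) -> bool:
    """A message is due for archiving once its timestamp is older than the cutoff."""
    return now - stamp > cutoff


now = datetime(2008, 7, 30)
cutoff = timedelta(days=180)  # assumed archive cutoff

message = {
    "created": datetime(2007, 6, 1),
    "last_modified": datetime(2007, 6, 1),
}

# Before anyone touches the message, both criteria agree.
before = due_for_archive(message["last_modified"], now, cutoff)

# The user opens the message in edit mode, hits the space bar and saves.
message["last_modified"] = datetime(2008, 7, 29)

after_by_modified = due_for_archive(message["last_modified"], now, cutoff)
after_by_created = due_for_archive(message["created"], now, cutoff)

print(before, after_by_modified, after_by_created)
```

A modified-date filter loses track of the touched message, while a created-date filter still selects it, which is the trade-off being debated in this thread.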

Andy Donaldson said...

@idonotes - I know. I heard you preaching the gospel in Boston ;-)

@Roland - You really hit the nail on the head by stating that email is being used like Word or Excel. It is! You put into words what I have been trying to say for a long time! Thanks!!

@Richard - The "acts against" I was thinking of is the archiving and unarchiving of the email. Right now, for folks to get around file retention, they have to go into an email, put it in edit mode, hit the space bar, and save and close. I've also caught people using agents to do the same thing. And yes, I am VERY much looking forward to the enhancements in 8.5 as well as 8.0.1 to help with this.

Denny Russell said...


I'd be happy to help you out with this. I've blogged several articles about Archiving and email retention policies.

Our products are Notes built products and we are/were Domino Administrators at one time and feel your pain.

I hope we can help you out.


Daniel Lieber said...

The various tools for collaboration have evolved and include e-mail, word processors, spreadsheets, and many other software mechanisms. Content of any sort needs to be retained because of the content itself, not the medium or program in which it is created. An active work-in-progress needs to be treated differently than a more static component. Requesting or requiring explicit closure of a topic, or using an arbitrary means to identify importance and status, is unrealistic, as users just do not provide the meta-data status and other flags. The key is to capture information with as little disruption to the user as possible (e.g. none!) and as much accuracy as possible while separating the chaff from the meaningful content. Traditional enterprise content management solutions often fail at capturing the desired amount of information from collaborative areas due to their rigidity of process. The solution we designed, Records Manager Express for Domino, takes a completely different approach -- the content is used with rules defined by the organization for complete automatic filing without user intervention.

I'm glad you posted this blog topic as it validates the importance of understanding the difference between the tools and their usage in a collaborative manner.