PDF inventory software

Tue Jun 9 12:42:03 UTC 2009

On 8 June 2009 at 23:17, Daniel Underwood wrote:

> I'm looking for a way to manage my personal collection of research
> articles.  Ideally I'd like some way to keep records on authors,
> keywords, journals, and publication years of articles (PDF files)
> downloaded onto my local drive.

Hi Daniel,

I am also a researcher, and although I did not find any tool suited to
managing my collection of articles, I worked out a method I am rather
happy with. Let me describe it:

The atom in the organization of my collection of articles is the
directory. This is handy because a directory can hold a lot of
additional information alongside the main file (the file containing
the article).

Each of these folders is stored in a vault. I chose the name vault
because, IIRC, the place where a dragon stores its treasures is
called the «dragon vault» in the relevant literature. We, gathering
all these articles we do not have time to read, are pretty much like
these dragons sleeping on their pile.

Here is the procedure to add an article to the collection:

1. I cd to the `vault'
2. I create a new folder to hold the article, usually with a rather
cryptic name (without accents or spaces) obtained
3. I cd to this new folder
4. I copy the article under the name `paper.pdf' or `paper.djvu'
5. I create a text file called INDEX, looking much like an email
envelope, giving the names of the authors and the article's title
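
As a rough illustration, the five steps above could be wrapped in a
small shell function. This is only a sketch: the function name
`vault_add', its arguments, and the `Author:' / `Title:' field names
used in INDEX are assumptions made for the example, not the actual
layout of my files.

```shell
# Hypothetical helper: vault_add VAULT NAME FILE AUTHORS TITLE
# Adds FILE to VAULT/NAME as paper.pdf or paper.djvu and writes INDEX.
vault_add() {
    vault=$1 name=$2 src=$3 authors=$4 title=$5

    # Only the two formats mentioned in the procedure are accepted.
    case $src in
        *.pdf)  ext=pdf ;;
        *.djvu) ext=djvu ;;
        *) echo "unsupported file type: $src" >&2; return 1 ;;
    esac

    # Steps 1-3: create the article's own folder inside the vault.
    # Plain mkdir fails if the folder already exists, so nothing is overwritten.
    mkdir -p "$vault" && mkdir "$vault/$name" || return 1

    # Step 4: store the article under its standardized name.
    cp "$src" "$vault/$name/paper.$ext" || return 1

    # Step 5: write an envelope-like INDEX (field names are assumed).
    printf 'Author: %s\nTitle: %s\n' "$authors" "$title" \
        > "$vault/$name/INDEX"
}
```

It would be called as, for instance,
`vault_add ~/vault smith09 ~/Downloads/article.pdf "J. Smith" "On Dragons"'
(a made-up entry).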

During the life of the article in my collection, I will usually add a
`mathscinet.bib' file for the bibliography entry (when it is taken from
MathSciNet). I may also add reviews of the article and text dumps, all
with standardized names.

With this organization, it is pretty easy to search the collection with
combinations of `find', `awk', and `grep'. Moreover, giving each
document its own folder makes the collection very flexible. I have even
written a program that produces a big `index.html' file from all of this,
but of course it is currently broken and I have no time to fix it (I
shall soon defend my PhD!). There is much more to do before this
collection has good tools to manage it!
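
To give an idea of the kind of searches meant here, a few queries
might look like the sketch below. The function names are invented for
the example, and the `Author:' and `Title:' fields are an assumed
INDEX layout; only the file names INDEX and `mathscinet.bib' come from
the description above.

```shell
# Hypothetical: list article folders whose INDEX names a given author.
# The second argument is used as a (case-insensitive) grep pattern.
vault_by_author() {
    grep -l -i "^Author:.*$2" "$1"/*/INDEX |
        while IFS= read -r f; do dirname "$f"; done
}

# Hypothetical: print "path/to/INDEX: title" for every article.
vault_titles() {
    find "$1" -type f -name INDEX -exec \
        awk '/^Title: / { sub(/^Title: /, ""); print FILENAME ": " $0 }' {} +
}

# Hypothetical: list article folders that have no mathscinet.bib yet.
vault_missing_bib() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue
        [ -e "${d}mathscinet.bib" ] || printf '%s\n' "${d%/}"
    done
}
```

Because every article lives in its own folder, each query is just a
walk over the vault plus a test on one standardized file name.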

BTW, `djvu' is an alternative format for storing articles digitally. It
has many qualities; among them, djvu files are usually much smaller
than the corresponding PDF files (for retrodigitized papers). See
djvu.org!
-- 
All the best,
Michaël