File naming convention for document filing
Summary
Description of a quick-to-learn, tried-and-tested storage and naming convention for storing invoices, insurance policies, letters, operating instructions, etc. in a flat hierarchy. Suitable for any operating system.
Introduction #
At the beginning of 2010, I started digitizing my office documents, book collection, operating instructions, letters, postcards, etc. and storing them digitally.
To do this, I came up with a naming convention and adapted and expanded it slightly over the years. My filing system and naming convention have been working well for 15 years now and have helped me in the most absurd situations during this time.
Over the past 4 years, I have advised friends, family, and even companies on how to name documents sensibly for digital storage. Some of them have adopted my system completely and rave about it. That is how the idea for this article came about. Here I explain, with many examples, how you can store documents sensibly using a suitable naming convention.
File storage convention #
Let me start with the File Storage Location Convention. This one will probably surprise you.
All documents end up in a single folder.
In my case, this folder is simply called “box”: ~/Documents/box/. There are no subfolders.
Advantages #
- Avoid duplicates: No two files can have the same name.
- Easy to find files: No more guessing, all files are in the same location.
- Easy to sort: Modern file browsers allow you to quickly sort and find files.
This system has proven itself, even when used with services such as Dropbox or Google Drive.
Possible limitations #
As a freelancer and employee, I have saved around 4,000 files (single-page to multi-page documents) using this system over the last 15 years. That corresponds to around 270 documents per year. By far the largest file in my “box” takes up 122 megabytes. It is a 372-page PDF file.
As long as you deal with such manageable amounts of documents, which also have a reasonable size, you shouldn’t have any problems with modern file systems.
Amount of files per folder #
- Linux & Android: The `ext4` file system allows directories to contain up to 4,000,000,000 files.
- Apple: The `apfs` file system allows directories to contain up to 9,223,372,036,854,775,808 files.
- Windows: The `ntfs` file system allows directories to contain up to 4,294,967,295 files.
My file storage convention should be suitable for families, freelancers, small and medium-sized companies that have a "reasonable" volume of documents. Large companies should work with an extra layer of folder arrangements, for example for years (/box/2024/) or decades (/box/202*/).
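The extra folder layer for years can be derived from the file name itself, because every name starts with the DATE. The following is a minimal sketch, assuming names follow the convention described in this article; the function name `yearly_path` is hypothetical and only used for illustration.

```python
from pathlib import PurePosixPath


def yearly_path(box: str, file_name: str) -> PurePosixPath:
    """Return box/YYYY/file_name, taking YYYY from the DATE prefix of the name."""
    year = file_name[:4]
    # The convention starts every name with YYYY-MM-DD, so the first
    # four characters are expected to be ASCII digits.
    if len(year) != 4 or not (year.isascii() and year.isdigit()):
        raise ValueError(f"file name does not start with a year: {file_name!r}")
    return PurePosixPath(box) / year / file_name


print(yearly_path("/box", "2020-02-02_contract_gongshow-inc_john-doe_employment-contract.pdf"))
# /box/2020/2020-02-02_contract_gongshow-inc_john-doe_employment-contract.pdf
```

The sketch only computes the target path. Moving the file is left to whatever tool you already use.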
File naming convention #
And now the juicy part. The actual naming convention by which I store my documents.
Parts of a file name #
Each file name may consist of the following pieces of information:
- DATE: Document creation date, `YYYY-MM-DD`.
- TYPE: `letter`, `contract`, `certificate`, etc. (predefined)
- CREATOR: Creator/issuer/sender of the document
- RECIPIENT: Recipient of the document (optional)
- DESCRIPTION: May include subject, invoice numbers, order numbers, customer numbers, etc.
- FILE EXTENSION: File type extension, e.g. `.pdf`, `.txt`, etc.
Writing convention #
All parts of a file name are …
- chained together using underscores
- written so that spaces are replaced by dashes (`-`)
- written in all lowercase, using the UTF-8 character set
- descriptive
DATE_TYPE_CREATOR_RECIPIENT_DESCRIPTION.FILE-EXTENSION
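If you prefer to generate names rather than type them, the pattern above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `build_file_name` and its parameter names are hypothetical, and character conversion and removal of illegal characters (covered later in this article) are not applied here.

```python
from datetime import date


def _slug(part: str) -> str:
    """Lowercase a part and replace runs of whitespace with single dashes."""
    return "-".join(part.lower().split())


def build_file_name(created: date, doc_type: str, creator: str, extension: str,
                    recipient: str = "", description: str = "") -> str:
    """Chain the parts with underscores: DATE_TYPE_CREATOR_RECIPIENT_DESCRIPTION.EXT."""
    parts = [created.isoformat(), _slug(doc_type), _slug(creator)]
    # RECIPIENT and DESCRIPTION are simply skipped when they are empty.
    if recipient:
        parts.append(_slug(recipient))
    if description:
        parts.append(_slug(description))
    return "_".join(parts) + "." + extension.lower().lstrip(".")


print(build_file_name(date(2020, 2, 2), "contract", "Gongshow Inc", "pdf",
                      recipient="John Doe", description="Employment Contract"))
# 2020-02-02_contract_gongshow-inc_john-doe_employment-contract.pdf
```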
Document TYPE #
For the TYPE, I’m using a small set of predefined values:
- `letter`: For all general communication documents.
- `contract`: For, e.g. rental agreements, employment contracts, mandates, etc.
- `certificate`: For, e.g. birth certificates, school certificates, etc.
- `invoice`: For all invoice documents.
- `receipt`: For all receipt documents, e.g. cash receipts.
- `offer`: For written offers received and made.
- `coo`: For order confirmations.
- `manual`: Instruction manuals, assembly instructions, etc.
- `article`: Newspaper articles, blog articles, publications, etc.
- `clippings`: Clippings documents created from, e.g. ebooks, articles, etc.
- `diary`: Daily notes, diary documents.
- `document`: For documents that do not match any other specific TYPE.
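Because the set of TYPE values is small and fixed, it is easy to check a name against it. The sketch below assumes the TYPE is always the second underscore-separated part and that DATE, TYPE, and CREATOR are always present; `DOCUMENT_TYPES` and `document_type` are hypothetical names chosen for this example.

```python
# The predefined TYPE values listed above.
DOCUMENT_TYPES = frozenset({
    "letter", "contract", "certificate", "invoice", "receipt", "offer",
    "coo", "manual", "article", "clippings", "diary", "document",
})


def document_type(file_name: str) -> str:
    """Return the TYPE part of a file name, or raise ValueError if it is unknown."""
    parts = file_name.split("_")
    # DATE, TYPE and CREATOR are mandatory, so at least three parts are expected.
    if len(parts) < 3:
        raise ValueError(f"expected at least DATE_TYPE_CREATOR: {file_name!r}")
    doc_type = parts[1]
    if doc_type not in DOCUMENT_TYPES:
        raise ValueError(f"unknown TYPE {doc_type!r} in {file_name!r}")
    return doc_type


print(document_type("2024-02-28_letter_ellexis-snoop_john-doe_birthday-card.jpg"))
# letter
```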
Document RECIPIENT #
The RECIPIENT can be optional. If there is no RECIPIENT, leave the RECIPIENT out. For example, there is no RECIPIENT when storing an article from absurdistantimes.com about the US election results.
2024-11-06_article_absurdistan-times.com_us-election-results-2024
The same is true for a diary entry for the same day.
2024-11-06_diary_john-doe.txt
On the other hand, there is almost always a creator/issuer/sender for a document. Do not leave out the CREATOR.
Examples #
I will only list a few simple examples here.
2020-02-02_contract_gongshow-inc_john-doe_employment-contract.pdf
2024-02-28_letter_ellexis-snoop_john-doe_birthday-card.jpg
For a detailed list with more examples and real files, which you can download, visit the Naming Convention repository on GitHub.
Once you have the files, you can play around with them in your file browser. For example, try to …
- find all files from May 2024 (use Ctrl+F to search for `2024-05`)
- list all diary entries of 2024 (use Ctrl+F to search for `diary`)
You will see how easy and intuitive it is to identify, sort, filter and find files.
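The same searches can be done in a script, since the file name carries all the information. This is a rough sketch rather than a recommended tool: the helper names are hypothetical, the TYPE is assumed to be the second underscore-separated part, and the last two names in `sample` are made-up examples.

```python
from pathlib import Path
from typing import Iterable, List


def list_box(box: Path) -> List[str]:
    """Return the sorted names of all files directly inside the box folder."""
    return sorted(p.name for p in box.iterdir() if p.is_file())


def from_period(names: Iterable[str], date_prefix: str) -> List[str]:
    """Names whose DATE starts with date_prefix, e.g. '2024-05' for May 2024."""
    return sorted(n for n in names if n.startswith(date_prefix))


def of_type(names: Iterable[str], doc_type: str, year: str = "") -> List[str]:
    """Names with the given TYPE, optionally restricted to one year."""
    return sorted(n for n in names
                  if n.startswith(year) and n.split("_")[1:2] == [doc_type])


# In real use: names = list_box(Path.home() / "Documents" / "box")
sample = [
    "2020-02-02_contract_gongshow-inc_john-doe_employment-contract.pdf",
    "2024-02-28_letter_ellexis-snoop_john-doe_birthday-card.jpg",
    "2024-05-14_invoice_acme-inc_john-doe_order-1234.pdf",  # made-up example
    "2024-11-06_diary_john-doe.txt",                        # made-up example
]
print(from_period(sample, "2024-05"))
print(of_type(sample, "diary", "2024"))
```

Unlike a plain text search for `diary`, `of_type` only matches the TYPE position, so a letter whose description happens to contain the word is not returned.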
Advantages #
Using the naming convention along with the storage convention has several advantages:
- Operating system agnostic: The naming and storage conventions work with all operating systems, such as Unix, GNU/Linux, Apple, Windows, Android, etc.
- Software agnostic: The naming and storage conventions do not require any additional software.
- Easy to back up: The naming and storage conventions work with all existing backup solutions.
- Helps to avoid duplicates: Applying the naming and storage conventions actively prevents duplicates. Even if they do occur, they are easy to identify.
- Easy to find files: Using the search, filter, and sorting functions of modern file browsers, files can be found in seconds, as long as the rules are applied consistently.
Possible limitations #
Amount of characters per file name #
The three file systems listed below support file names of up to 255 characters. The usable length depends on how many bytes a character takes, because some Unicode characters are encoded as multiple bytes.
- Linux & Android: The `ext4` file system allows file names to contain up to 255 characters.
- Apple: The `apfs` file system allows file names to contain up to 255 characters.
- Windows: The `ntfs` file system allows file names to contain up to 255 characters.
The longest document name in my ~/Documents/box/ directory is 204 characters long. The second longest is 158 characters long. I have not managed to run into limitations.
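To see how close a name comes to the limit, you can compare its length in characters with its length in bytes. The sketch below assumes the limit is counted in UTF-8 bytes, which is the stricter reading; the function names and the default of 255 are illustrative only.

```python
from typing import Tuple


def name_length(file_name: str) -> Tuple[int, int]:
    """Return (number of characters, number of UTF-8 bytes) of a file name."""
    return len(file_name), len(file_name.encode("utf-8"))


def fits(file_name: str, limit: int = 255) -> bool:
    """True if the UTF-8 encoded name is at most `limit` bytes long."""
    return len(file_name.encode("utf-8")) <= limit


# "\u00fc" is the letter u with umlaut: one character, two bytes in UTF-8.
print(name_length("\u00fcbung.pdf"))
# (9, 10)
print(fits("2020-02-02_contract_gongshow-inc_john-doe_employment-contract.pdf"))
# True
```

Names that have gone through the character conversion described below consist almost entirely of single-byte characters, so both numbers are usually identical.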
Standardizations #
Character conversion #
When creating file names, be sure to convert the following characters. This will make it easier to search for files on the largest possible number of international keyboards.
| Char | Conversion |
|---|---|
| ä | ae |
| ö | oe |
| ü | ue |
| ß | ss |
| æ | ae |
| œ | oe |
| ø | oe |
| å | aa |
| ð | eth |
| š | s |
| ž | z |
| õ | oe |
| ë | e |
| α | alpha |
| $ | dollar |
| € | euro |
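The table translates directly into a lookup. The following sketch assumes that text is lowercased before conversion and that characters not listed in the table are left unchanged; `CONVERSIONS` and `convert_characters` are hypothetical names.

```python
import unicodedata

# Conversion table from above. Keys are normalized to NFC so that composed
# and decomposed spellings of the same letter are treated alike.
CONVERSIONS = {
    unicodedata.normalize("NFC", char): replacement
    for char, replacement in {
        "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss", "æ": "ae", "œ": "oe",
        "ø": "oe", "å": "aa", "ð": "eth", "š": "s", "ž": "z", "õ": "oe",
        "ë": "e", "α": "alpha", "$": "dollar", "€": "euro",
    }.items()
}


def convert_characters(text: str) -> str:
    """Lowercase text and replace every character found in the table."""
    normalized = unicodedata.normalize("NFC", text).lower()
    return "".join(CONVERSIONS.get(char, char) for char in normalized)


print(convert_characters("Jörg Müller"))
# joerg mueller
```

Spaces are kept as they are here. Replacing them with dashes is a separate step of the writing convention.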
Illegal characters #
Avoid using the following special characters in file names. Apply character conversion or leave them out completely.
Example:
| Example | Conversion |
|---|---|
| ACME, Inc. | acme-inc |
- `.` period (only use to separate file name and extension)
- `,` comma
- `#` pound
- `%` percent
- `{` left curly bracket
- `}` right curly bracket
- `\` backslash
- `<` left angle bracket
- `>` right angle bracket
- `*` asterisk
- `?` question mark
- `/` forward slash
- ` ` blank spaces
- `$` dollar sign
- `€` euro sign
- `!` exclamation point
- `"` double quotes
- `“` double quotes
- `„` double quotes
- `:` colon
- `@` at sign
- `+` plus sign
- `` ` `` backtick
- `|` pipe
- `=` equal sign
- emojis
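Leaving these characters out can also be automated. The sketch below is one possible approach, not a complete implementation: it drops the listed characters from a single part of a file name, lowercases it, and replaces blank spaces with dashes. Emojis are not handled, and `$` and `€` are simply dropped, so apply the character conversion first if you want them spelled out. The names `ILLEGAL_CHARACTERS` and `sanitize_part` are hypothetical.

```python
# Characters from the list above, except blank spaces (handled separately)
# and emojis (not handled in this sketch).
# \u20ac is the euro sign, \u201c and \u201e are typographic double quotes.
ILLEGAL_CHARACTERS = frozenset('.,#%{}\\<>*?/$!":@+`|=' + "\u20ac\u201c\u201e")


def sanitize_part(text: str) -> str:
    """Clean one part of a file name (not the whole name including extension)."""
    kept = "".join(char for char in text.lower() if char not in ILLEGAL_CHARACTERS)
    # Collapse runs of whitespace and join the remaining words with dashes.
    return "-".join(kept.split())


print(sanitize_part("ACME, Inc."))
# acme-inc
```

Because the period is removed as well, the function is meant for individual parts such as CREATOR or DESCRIPTION. The file extension is added afterwards.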
License
File naming convention for document filing by Jonas Jared Jacek is licensed under CC BY-SA 4.0.
This license requires that reusers give credit to the creator. It allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, even for commercial purposes, as long as modified material is licensed under identical terms. To give credit, provide a link back to the original source, the author, and the license, e.g. like this:
<p xmlns:cc="http://creativecommons.org/ns#" xmlns:dct="http://purl.org/dc/terms/"><a property="dct:title" rel="cc:attributionURL" href="https://www.ditig.com/document-filing-naming-convention">File naming convention for document filing</a> by <a rel="cc:attributionURL dct:creator" property="cc:attributionName" href="https://www.j15k.com/">Jonas Jared Jacek</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer">CC BY-SA 4.0</a>.</p>
For more information see the Ditig legal page.