Best Practice: File naming convention for document filing

Summary

Description of a quick-to-learn, tried-and-tested storage and naming convention for storing invoices, insurance policies, letters, operating instructions, etc. in a flat hierarchy. Suitable for any operating system.

Introduction #

At the beginning of 2010, I started digitizing my office documents, book collection, operating instructions, letters, postcards, etc. and storing them digitally.

To do this, I came up with a naming convention and adapted and expanded it slightly over the years. My filing system and naming convention have been working well for 15 years now and have helped me in the most absurd situations during this time.

Over the past 4 years, I have advised friends, family, and even companies on how to name documents sensibly for digital storage. Some of the people who have completely adopted my system rave about the approach. That is how the idea for this article came about. In it, I explain, with many examples, how to store documents sensibly using a suitable naming convention.

File storage convention #

Let me start with the file storage convention. This one will probably surprise you.

All documents end up in a single folder.

In my case, this folder is simply called “box”: ~/Documents/box/. There are no subfolders.

Advantages #

  • Avoid duplicates
    No two files can have the same name.
  • Easy to find files
    No more guessing, all files are in the same location.
  • Easy to sort
    Modern file browsers let you sort and find files quickly.

This system has proven itself, even when used with cloud storage services such as Dropbox or Google Drive.

Possible limitations #

As a freelancer and employee, I have saved around 4,000 files (single-page and multi-page documents) using this system over the last 15 years. That corresponds to around 270 documents per year. The largest file in my “box” by far takes up 122 megabytes. It is a 372-page PDF file.

As long as you deal with such manageable amounts of documents, which also have a reasonable size, you shouldn’t have any problems with modern file systems.

Amount of files per folder #

  • Linux & Android
    The ext4 file system allows directories to contain up to 4,000,000,000 files
  • Apple
    The APFS file system allows directories to contain up to 9,223,372,036,854,775,808 files
  • Windows
    The NTFS file system allows directories to contain up to 4,294,967,295 files

My file storage convention should be suitable for families, freelancers, small and medium-sized companies that have a "reasonable" volume of documents. Large companies should work with an extra layer of folder arrangements, for example for years (/box/2024/) or decades (/box/202*/).

File naming convention #

And now the juicy part. The actual naming convention by which I store my documents.

Parts of a file name #

Each file name may consist of the following pieces of information:

  • DATE
    Document creation date YYYY-MM-DD.
  • TYPE
    letter, contract, certificate, etc. (predefined)
  • CREATOR
    Creator/issuer/sender of the document
  • RECIPIENT
    Recipient of the document (optional)
  • DESCRIPTION
    May include subject, invoice numbers, order numbers, customer numbers, etc.
  • FILE EXTENSION
    File type extension, e.g. .pdf, .txt, etc.

Writing convention #

All parts of a file name are …

  • chained together using underscores
  • written so that spaces are replaced by dashes (-)
  • written in lowercase, using the UTF-8 character set
  • descriptive

DATE_TYPE_CREATOR_RECIPIENT_DESCRIPTION.FILE-EXTENSION
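
The pattern above can be sketched in a few lines of Python. This is only a minimal illustration of the writing convention, not a tool that belongs to it: the function names `slug` and `build_file_name`, the parameter names, and the default extension are hypothetical.

```python
from datetime import date


def slug(text: str) -> str:
    """Lowercase one part and replace runs of whitespace with single dashes."""
    return "-".join(text.lower().split())


def build_file_name(created: date, doc_type: str, creator: str,
                    description: str = "", recipient: str = "",
                    extension: str = "pdf") -> str:
    """Chain the parts with underscores in the order DATE_TYPE_CREATOR_RECIPIENT_DESCRIPTION."""
    parts = [created.isoformat(), doc_type, creator, recipient, description]
    # Empty (optional) parts are skipped instead of leaving a double underscore.
    stem = "_".join(slug(part) for part in parts if part)
    return f"{stem}.{extension.lower().lstrip('.')}"


print(build_file_name(date(2020, 2, 2), "contract", "Gongshow Inc",
                      description="Employment contract", recipient="John Doe"))
# prints 2020-02-02_contract_gongshow-inc_john-doe_employment-contract.pdf
```

The sketch does not handle character conversion or illegal characters; it only covers lowercase, dashes, and underscores.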

Document TYPE #

For the TYPE, I’m using a small set of predefined values:

  • letter
    For all general communication documents.
  • contract
    For, e.g. rental agreements, employment contracts, mandates, etc.
  • certificate
    For, e.g. birth certificates, school certificates, etc.
  • invoice
    For all invoice documents.
  • receipt
    For all receipt documents, e.g. cash receipts.
  • offer
    For written offers received and made.
  • coo
    For order confirmations.
  • manual
    Instruction manuals, assembly instructions, etc.
  • article
    Newspaper articles, blog articles, publications, etc.
  • clippings
    Clippings documents created from, e.g. ebooks, articles, etc.
  • diary
    Daily notes, diary documents.
  • document
    For documents that do not match any other specific TYPE.
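
Because DATE and TYPE always come first and TYPE is taken from a fixed list, a file name can be checked mechanically. The following Python sketch assumes exactly the TYPE values listed above; the function name `parse_file_name` and the shape of the returned dictionary are hypothetical. Since RECIPIENT is optional, the sketch does not try to guess whether a part after CREATOR is a recipient or a description and returns those parts as `rest`.

```python
import re
from datetime import date
from pathlib import PurePath

# The predefined TYPE values from the list above.
DOCUMENT_TYPES = {
    "letter", "contract", "certificate", "invoice", "receipt", "offer",
    "coo", "manual", "article", "clippings", "diary", "document",
}


def parse_file_name(file_name: str) -> dict:
    """Split a file name into DATE, TYPE, CREATOR and the remaining parts."""
    stem = PurePath(file_name).stem  # drops the file extension
    parts = stem.split("_")
    if len(parts) < 3:
        raise ValueError(f"expected at least DATE_TYPE_CREATOR: {file_name!r}")
    date_part, doc_type, creator, *rest = parts
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", date_part):
        raise ValueError(f"DATE is not YYYY-MM-DD: {date_part!r}")
    created = date.fromisoformat(date_part)  # rejects impossible dates
    if doc_type not in DOCUMENT_TYPES:
        raise ValueError(f"unknown TYPE: {doc_type!r}")
    return {"date": created, "type": doc_type, "creator": creator, "rest": rest}


print(parse_file_name("2024-02-28_letter_ellexis-snoop_john-doe_birthday-card.jpg"))
```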

Document RECIPIENT #

The RECIPIENT is optional. If a document has no RECIPIENT, leave that part out. For example, there is no RECIPIENT when storing an article from absurdistantimes.com about the US election results.

2024-11-06_article_absurdistan-times.com_us-election-results-2024

The same is true for a diary entry for the same day.

2024-11-06_diary_john-doe.txt

On the other hand, there is almost always a creator/issuer/sender for a document. Do not leave out the CREATOR.

Examples #

I will only list a few simple examples here.

2020-02-02_contract_gongshow-inc_john-doe_employment-contract.pdf
2024-02-28_letter_ellexis-snoop_john-doe_birthday-card.jpg

For a detailed list with more examples and real files, which you can download, visit the Naming Convention repository on GitHub.

Once you have the files, you can play around with them in your file browser. For example, try to …

  • find all files from May 2024 (use Ctrl+F to search for 2024-05)
  • list all diary entries of 2024 (use Ctrl+F to search for diary)

You will see how easy and intuitive it is to identify, sort, filter, and find files.
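
The same searches can also be scripted. The Python sketch below assumes that all files sit in one flat folder and follow the DATE_TYPE_CREATOR order; the function names `find_by_month` and `find_by_type` are hypothetical.

```python
from pathlib import Path


def find_by_month(box: Path, year: int, month: int) -> list[str]:
    """Return the names of all files whose DATE starts with YYYY-MM-."""
    prefix = f"{year:04d}-{month:02d}-"
    return sorted(p.name for p in box.iterdir()
                  if p.is_file() and p.name.startswith(prefix))


def find_by_type(box: Path, doc_type: str, year: int) -> list[str]:
    """Return the names of all files of one TYPE created in the given year."""
    matches = []
    for path in box.iterdir():
        parts = path.name.split("_")
        # parts[0] is the DATE, parts[1] is the TYPE.
        if (path.is_file() and len(parts) >= 3
                and parts[0].startswith(f"{year:04d}-")
                and parts[1] == doc_type):
            matches.append(path.name)
    return sorted(matches)
```

With a folder such as ~/Documents/box/, a call like `find_by_month(Path.home() / "Documents" / "box", 2024, 5)` would list all files from May 2024. Because the date comes first in every name, sorting the names alphabetically also sorts them chronologically.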

Advantages #

Using the naming convention along with the storage convention has several advantages:

  • Operating system agnostic
    The naming and storage conventions work on all operating systems, such as Unix, GNU/Linux, Apple, Windows, Android, etc.
  • Software agnostic
    The naming and storage conventions do not require any additional software.
  • Easy to back up
    The naming and storage conventions work with all existing backup solutions.
  • Helps to avoid duplicates
    Following the naming and storage conventions actively prevents duplicates. Even if they do occur, they are easy to identify.
  • Easy to find files
    With the search, filter, and sorting functions of modern file browsers, files can be found in seconds, as long as the rules are applied consistently.

Possible limitations #

Amount of characters per file name #

The file systems listed below support file names with a length of up to 255. How many characters actually fit depends on how the file system counts and on how many bytes a character takes, because many Unicode characters are multibyte.

  • Linux & Android
    The ext4 file system allows file names to contain up to 255 bytes
  • Apple
    The APFS file system allows file names to contain up to 255 characters
  • Windows
    The NTFS file system allows file names to contain up to 255 characters (UTF-16 code units)

The longest document name in my ~/Documents/box/ directory is 204 characters long. The second longest is 158 characters long. I have not managed to run into limitations.
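
To see how close a name comes to these limits, its length can be measured in the different units. The Python sketch below is an illustration under the assumption that ext4 counts UTF-8 bytes and NTFS counts UTF-16 code units; the function names `name_length` and `fits_limit` are hypothetical.

```python
def name_length(file_name: str) -> dict:
    """Measure a file name in characters, UTF-8 bytes, and UTF-16 code units."""
    return {
        "characters": len(file_name),
        "utf8_bytes": len(file_name.encode("utf-8")),
        "utf16_units": len(file_name.encode("utf-16-le")) // 2,
    }


def fits_limit(file_name: str, limit: int = 255) -> bool:
    """True if the name stays within the limit in every unit measured above."""
    return max(name_length(file_name).values()) <= limit


print(name_length("2024-02-28_letter_ellexis-snoop_john-doe_birthday-card.jpg"))
print(fits_limit("2024-02-28_letter_ellexis-snoop_john-doe_birthday-card.jpg"))
```

For names that contain only ASCII characters, all three numbers are identical, which is one more argument for applying a character conversion.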

Standardizations #

Character conversion #

When creating file names, be sure to convert the following characters. This will make it easier to search for files on the largest possible number of international keyboards.

Char    Conversion
ä       ae
ö       oe
ü       ue
ß       ss
æ       ae
œ       oe
ø       oe
å       aa
ð       eth
š       s
ž       z
õ       oe
ë       e
α       alpha
$       dollar
€       euro

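
The conversion table can be applied automatically with a simple lookup. The Python sketch below is a minimal illustration; the names `CONVERSIONS` and `convert_characters` are hypothetical, and the entry for the euro sign assumes that the last table row refers to the character €.

```python
import unicodedata

# One entry per row of the conversion table.
CONVERSIONS = {
    "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss", "æ": "ae", "œ": "oe",
    "ø": "oe", "å": "aa", "ð": "eth", "š": "s", "ž": "z", "õ": "oe",
    "ë": "e", "α": "alpha", "$": "dollar", "€": "euro",
}


def convert_characters(text: str) -> str:
    """Lowercase the text and replace every character listed in the table."""
    # NFC makes sure that, e.g. "a" plus a combining diaeresis becomes one "ä".
    text = unicodedata.normalize("NFC", text.lower())
    return "".join(CONVERSIONS.get(char, char) for char in text)


print(convert_characters("Müller"))  # prints mueller
print(convert_characters("Straße"))  # prints strasse
```

Characters that are not in the table are passed through unchanged, so the table can be extended as needed.
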
Illegal characters #

Avoid using the following special characters in file names. Apply character conversion or leave them out completely.

Example:

Example       Conversion
ACME, Inc.    acme-inc

  • . period (only use to separate file name and extension)
  • , comma
  • # pound
  • % percent
  • { left curly bracket
  • } right curly bracket
  • \ back slash
  • < left angle bracket
  • > right angle bracket
  • * asterisk
  • ? question mark
  • / forward slash
  • ` ` blank spaces
  • $ dollar sign
  • € euro sign
  • ! exclamation point
  • " double quotes (straight and curly)
  • : colon
  • @ at sign
  • + plus sign
  • ` backtick
  • | pipe
  • = equal sign
  • emojis
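
A file name can be checked against this list before it is saved. The Python sketch below is a rough illustration, not a complete validator: the names `ILLEGAL_CHARACTERS` and `illegal_characters` are hypothetical, curly quotes are left out, and emojis are only approximated through the Unicode category "So" (symbol, other), which is an assumption that covers most but not all of them.

```python
import unicodedata

# The special characters from the list above, including the blank space.
ILLEGAL_CHARACTERS = set(',#%{}\\<>*?/ $€!":@+`|=')


def illegal_characters(file_name: str) -> list[str]:
    """Return the characters to avoid that occur before the file extension."""
    stem, dot, _extension = file_name.rpartition(".")
    if not dot:  # no extension at all, so check the whole name
        stem = file_name
    found = []
    for char in stem:
        is_illegal = (
            char == "."  # only the period before the extension is allowed
            or char in ILLEGAL_CHARACTERS
            or unicodedata.category(char) == "So"  # rough emoji heuristic
        )
        if is_illegal and char not in found:
            found.append(char)
    return found


print(illegal_characters("ACME, Inc.pdf"))  # prints [',', ' ']
print(illegal_characters("acme-inc.pdf"))   # prints []
```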

Author

Jonas Jared Jacek • J15k

Jonas has worked as a project manager, web designer, and web developer since 2001. On top of that, he is a Linux system administrator with a broad interest in things related to programming, architecture, and design. See: https://www.j15k.com/

License

File naming convention for document filing by Jonas Jared Jacek is licensed under CC BY-SA 4.0.

This license requires that reusers give credit to the creator. It allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, including for commercial purposes, as long as adaptations are shared under the same license. To give credit, provide a link back to the original source, the author, and the license, e.g. like this:

<p xmlns:cc="http://creativecommons.org/ns#" xmlns:dct="http://purl.org/dc/terms/"><a property="dct:title" rel="cc:attributionURL" href="https://www.ditig.com/document-filing-naming-convention">File naming convention for document filing</a> by <a rel="cc:attributionURL dct:creator" property="cc:attributionName" href="https://www.j15k.com/">Jonas Jared Jacek</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer">CC BY-SA 4.0</a>.</p>

For more information see the Ditig legal page.
