We need a tag-based file system “yesterday”
You: Why a tag based file system?
Me: Because all current file systems are broken.
You: What the F are you talking about?
Me: I know you know more about file systems than I do, but hear me out with a logical, open mind & then draw your own conclusion.
Current File systems are broken
To understand my reasoning, let us first understand what a file system is. It is a system for files. To be precise, it is an organizational system around how files are stored on disk (yes, I am specifically speaking about disk-based file systems here). But what is the point of storing a file if we cannot retrieve it? So the description should have been “an organizational system around how files are stored & retrieved.” In other words, retrieval should be in focus as much as storing is. So, roughly a 50-50 split.
But if we actually think about it, we store so that we can retrieve later. If we didn’t want to retrieve, we wouldn’t be storing in the first place. So, for all practical purposes, retrieval should be given the higher weight in the split. In other words, file systems should be designed with retrieval in mind more than storage. But as far as I can tell, modern file systems focus on how to organize the on-disk layout to optimize storing. This is precisely where modern-day file systems are broken. As a final note to elaborate my stance, I would like to point out that we store our work files under a ‘Work’ folder and personal files under a ‘Personal’ folder so that they are easier for us to retrieve. Hence, retrieval is far more important than storage. Q.E.D.
How will a tag-based file system help?
With a tag-based file system, a file is referenced by its tag(s) and not by its storage path. In other words, the storage path of a file is unimportant, i.e., the file can be stored anywhere!
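To make the idea concrete, here is a minimal in-memory sketch of such an index. Everything in it is an assumption for illustration: the `TagIndex` class, its method names, and the opaque `blob-0001` style file ids are hypothetical and not taken from any existing file system. The point is only that a lookup never touches a path.

```python
from collections import defaultdict


class TagIndex:
    """Toy index mapping tags to file ids (hypothetical design)."""

    def __init__(self):
        self._files_by_tag = defaultdict(set)  # tag -> {file_id, ...}
        self._tags_by_file = defaultdict(set)  # file_id -> {tag, ...}

    def add(self, file_id, *tags):
        """Attach one or more tags to a file id."""
        for tag in tags:
            self._files_by_tag[tag].add(file_id)
            self._tags_by_file[file_id].add(tag)

    def files(self, tag):
        """Return a copy of the set of file ids carrying this tag."""
        return set(self._files_by_tag.get(tag, set()))

    def tags(self, file_id):
        """Return a copy of the set of tags on this file id."""
        return set(self._tags_by_file.get(file_id, set()))


index = TagIndex()
# The ids are opaque: where the bytes live on disk is not part of the lookup.
index.add("blob-0001", "vacations", "italy", "pictures")
index.add("blob-0002", "work", "design")
print(index.files("italy"))  # {'blob-0001'}
```

Keeping both directions (tag to files, file to tags) is a design choice of this sketch so that listing a file’s tags is as cheap as listing a tag’s files.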
For a real-world example, do you realize that Gmail uses a ‘tag’ approach (it calls them labels), as opposed to the traditional MS Outlook style of ‘folder’ approach, to organize your email?!
A further example: suppose you have stored the photos of your vacation to Italy on your hard disk. Currently, in the ‘folder’-based approach, they would be under “/home/bob/vacations/italy/pictures” & retrieved with:
$> ls /home/bob/vacations/italy/pictures
But in a tag-based approach, you would reference them with the keywords vacations, italy, pictures. So, if the ‘ls’ command were ported to understand tags, it would (perhaps) be:
$> ls vacations+italy+pictures
The above is an equivalence use case, i.e., a tag-based file system can do everything a folder-based file system can. Now, let us look into the differences. To be precise, what advantages do tag-based file systems bring to the table that folder-based ones can’t?
Negation
A use case: retrieve all pictures from our Italy vacation that were not taken in Rome!
$> ls vacations+italy+pictures-rome
In a folder-based file system, this is not easy if all the pictures are put in the same folder!! So, negation is the simplest & almost the first extension of a tag-based file system.
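Under the hood, ‘+’ and ‘-’ map naturally onto set operations: every ‘+’ tag must be present and no ‘-’ tag may be. The sketch below is one possible reading of the query syntax used above, not a real ‘ls’. The `parse_query` and `ls` functions and the sample file names are made up for illustration.

```python
import re


def parse_query(query):
    """Split 'a+b-c' into (required, excluded) tag sets.

    A leading tag with no sign counts as required. Tags that themselves
    contain '+' or '-' are not supported by this toy syntax.
    """
    required, excluded = set(), set()
    for sign, tag in re.findall(r"([+-]?)([^+-]+)", query):
        (excluded if sign == "-" else required).add(tag.strip())
    return required, excluded


def ls(files, query):
    """Names of files having every required tag and no excluded tag."""
    required, excluded = parse_query(query)
    return sorted(
        name for name, tags in files.items()
        if required <= tags and not (excluded & tags)
    )


# Hypothetical files and their tags.
files = {
    "colosseum": {"vacations", "italy", "pictures", "rome"},
    "gondola": {"vacations", "italy", "pictures", "venice"},
    "budget": {"vacations", "italy", "spreadsheet"},
}
print(ls(files, "vacations+italy+pictures"))       # ['colosseum', 'gondola']
print(ls(files, "vacations+italy+pictures-rome"))  # ['gondola']
```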
Disordered order
A use case: retrieve all pictures from our Italy vacation
$> ls vacations+italy+pictures
or
$> ls italy+vacations+pictures
or
$> ls pictures+vacations+italy
… and so on! #self-explanatory!
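Order stops mattering as soon as a query is treated as a set of tags rather than a sequence. A small sketch, assuming a ‘+’-only query and a hypothetical `canonical` helper:

```python
def canonical(query):
    """Reduce a '+'-joined query to an order-free set of tags."""
    return frozenset(tag for tag in query.split("+") if tag)


a = canonical("vacations+italy+pictures")
b = canonical("italy+vacations+pictures")
c = canonical("pictures+vacations+italy")
print(a == b == c)  # True

# frozensets are hashable, so one cache entry can serve every ordering.
cache = {a: ["colosseum", "gondola"]}
print(cache[c])  # ['colosseum', 'gondola']
```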
Auto tagging
This is but an obvious next step. Imagine the docs are tagged based on their context & content. We would not need to worry about properly naming the files or deciding where exactly to place a document. For example, say there are 2 projects (device driver & file system) and their design documents need to be stored. Both file names could be ‘design.doc’ & with the auto-tagging feature, new tags will additionally & automatically be appended: device driver, linux, design, doc for the first and file system, design, doc for the next!
$> ls design+driver
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 design-1.3
vs
$> ls design+filesystem
-rwxrwx--- 1 root vboxsf 244147 2014-09-22 14:37 design_v10
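How the tags get derived is the hard part. As a deliberately naive sketch, assume a hand-written table of keywords per tag; the `RULES` table and the `auto_tags` function are hypothetical, and a real tagger would need something far more robust than substring matching.

```python
# Hypothetical rules: tag -> keywords that suggest it.
RULES = {
    "driver": ("device driver", "interrupt", "ioctl"),
    "filesystem": ("file system", "inode", "superblock"),
    "linux": ("linux", "kernel"),
    "design": ("design", "architecture"),
}


def auto_tags(text):
    """Return every tag with at least one keyword found in the text."""
    lowered = text.lower()
    return {
        tag for tag, words in RULES.items()
        if any(word in lowered for word in words)
    }


doc_a = "Design notes for the Linux device driver: ioctl handling."
doc_b = "File system design: inode layout and superblock format."
print(sorted(auto_tags(doc_a)))  # ['design', 'driver', 'linux']
print(sorted(auto_tags(doc_b)))  # ['design', 'filesystem']
```

Both documents could carry the same file name; the tags derived from their content are what tell them apart.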
No need for file extensions
With auto-tagging capability, extensions are but tags! Truly! Who needs file extensions anymore? We can name the file ‘Resume’ with no need for ‘Resume.docx’ or ‘Resume.pdf’. The auto-tagging will automatically tag it as ‘pdf’ or ‘doc’ according to the true type of the file!
$> ls design+driver+pdf
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 device_driver_design-1.3
$> ls design+driver+txt
-rwxrwx--- 1 root vboxsf 244147 2014-09-22 14:37 device_driver_design-1.3
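Detecting the “true type” usually means looking at the first few bytes of the content instead of the name. A minimal sketch, assuming only three well-known signatures and a fallback to plain text; the `type_tag` function and the tag names are hypothetical:

```python
# Leading "magic" bytes of a few well-known formats.
MAGIC = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
]


def type_tag(data):
    """Guess a type tag from content alone; the file name is never used."""
    for magic, tag in MAGIC:
        if data.startswith(magic):
            return tag
    try:
        data.decode("utf-8")
        return "txt"  # decodes cleanly, so treat it as plain text
    except UnicodeDecodeError:
        return "binary"  # unknown format


print(type_tag(b"%PDF-1.7\n..."))         # pdf
print(type_tag(b"Dear hiring manager,"))  # txt
```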
Archival
Reduce the clutter of files you see while browsing. Just like emails! Archive old files that are no longer relevant, yet retrieve them whenever we want!
$> ls projectA
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 common_components-1.3
-rwxrwx--- 1 root vboxsf 244147 2014-09-22 14:37 Common_components_patch_steps
$> ls archived+projectA
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 common_components_detailed_design
-rwxrwx--- 1 root vboxsf 244147 2014-09-22 14:37 Common_components_highlevel_design
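One way to get this behaviour is to treat ‘archived’ as an ordinary tag that listings hide unless the query names it. The sketch below assumes exactly that; the reserved tag name and the `ls` function are hypothetical.

```python
ARCHIVED = "archived"  # hypothetical reserved tag name


def ls(files, query):
    """List files matching all '+'-joined tags; hide archived by default."""
    wanted = set(query.split("+"))
    show_archived = ARCHIVED in wanted
    return sorted(
        name for name, tags in files.items()
        if wanted <= tags and (show_archived or ARCHIVED not in tags)
    )


files = {
    "common_components-1.3": {"projectA"},
    "Common_components_patch_steps": {"projectA"},
    "common_components_detailed_design": {"projectA", "archived"},
}
print(ls(files, "projectA"))
# ['Common_components_patch_steps', 'common_components-1.3']
print(ls(files, "archived+projectA"))
# ['common_components_detailed_design']
```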
External-Archives
This is an extension of the above. We could even connect an external hard disk & say ‘push all the archived data to the external hard disk’.
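Since tags, not paths, identify a file, such a push is little more than “move whatever carries the archived tag”. A rough sketch, assuming a plain mapping from path to tags and a hypothetical `push_archived` helper; a temporary directory stands in for both disks.

```python
import shutil
import tempfile
from pathlib import Path


def push_archived(tags_by_path, external_dir):
    """Move every file tagged 'archived' into external_dir.

    Returns {old_path: new_path}. Name clashes on the target are not handled.
    """
    external = Path(external_dir)
    external.mkdir(parents=True, exist_ok=True)
    moved = {}
    for path, tags in tags_by_path.items():
        if "archived" in tags:
            target = external / Path(path).name
            shutil.move(str(path), str(target))
            moved[path] = target
    return moved


# Demo in a throwaway directory standing in for the two disks.
with tempfile.TemporaryDirectory() as tmp:
    disk = Path(tmp) / "disk"
    disk.mkdir()
    old = disk / "detailed_design"
    cur = disk / "patch_steps"
    old.write_text("old notes")
    cur.write_text("current notes")

    moved = push_archived(
        {old: {"projectA", "archived"}, cur: {"projectA"}},
        Path(tmp) / "external",
    )
    print(sorted(p.name for p in moved))  # ['detailed_design']
    print(old.exists(), cur.exists())     # False True
```

A real implementation would also update the index so that the moved files stay reachable by their tags.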
Reduce time spent on file names
If more apps (like email clients) support tags, then the recipient will also get the tags (+ auto-tags). With auto-tags, imagine if we could tag pictures being imported based on their GPS data?! Or tag all emails downloaded from MS Outlook as ‘work’ & all from Gmail as ‘personal’! Then the tags provided by the author (say your manager) will get appended with our auto-tags (say ‘work’), which means your search could be “manager+pdf+approval” instead of “project_approval.pdf”. In other plain words, don’t worry about naming conventions; all files could be named ‘New Document’, who cares!
$> ls projectA
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 file1
-rwxrwx--- 1 root vboxsf 244147 2014-09-22 14:37 file2
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 file3
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 file4
$> ls projectA+pdf
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 file1
-rwxrwx--- 1 root vboxsf 244147 2014-09-22 14:37 file2
$> ls projectA+pdf+manager
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 file1
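The merging step is just a set union: the tags that travel with the file plus the tags added on arrival. A small sketch under that assumption; `receive`, `matches`, and the sample tags are all hypothetical.

```python
def receive(sender_tags, local_auto_tags):
    """Union the tags that arrived with a file and the tags added locally."""
    return set(sender_tags) | set(local_auto_tags)


def matches(tags, query):
    """True when the file carries every '+'-joined tag in the query."""
    return set(query.split("+")) <= tags


# Tags the author (say, your manager) attached before sending.
from_manager = {"manager", "approval", "projectA"}
# Tags a hypothetical local auto-tagger adds on arrival.
added_locally = {"work", "pdf"}

tags = receive(from_manager, added_locally)
print(matches(tags, "manager+pdf+approval"))  # True
print(matches(tags, "personal+pdf"))          # False
```

The file name never enters the picture, which is why it could just as well be ‘New Document’.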
Conclusion
I, obviously, am not a file system expert. But within my capacity as a file system enthusiast, I have put forth my points as to why we need a tag-based file system and what we can achieve with it. The examples & suggested extensions are not exhaustive or even close to complete. But with this many advantages staring us in the face, why is it that we haven’t moved to a tag-based file system yet?!!
What Next?
Allow me to be cocky and design a tag-based file system. I shall give a link to the blog here ASAP.