We need a Tag based File system “yesterday”

Categories: file system

You: Why a tag based file system?
Me: Because all current file systems are broken.
You: What the F are you talking about?
Me: I know you know more about file systems than I do, but hear me out with an open, logical mind & then decide.

Current File systems are broken

To understand my reasoning, let us first understand what a file system is. It is a system for files. To be precise, it is an organizational system around how files are stored on disk (yes, I am specifically speaking about disk-based file systems here). But what is the point of storing a file if we cannot retrieve it? So the description should have been “an organizational system around how files are stored & retrieved.” In other words, retrieval should get as much focus as storage. So, kind of a 50-50 split.

But if we actually think about it, we store so that we can retrieve later. If we never wanted to retrieve, we wouldn’t be storing in the first place. So, for all practical purposes, retrieval should be given the higher weight in the split. In other words, file systems should be designed with retrieval in mind more than storage. But to me, all modern file systems focus on how to organize the layout to optimize storing. This is precisely where modern-day file systems are broken. On a final note to elaborate my stance, I would like to point out that we store our work files under a ‘Work’ folder and personal files under a ‘Personal’ folder so that they are easier for us to retrieve. Hence, retrieval is far more important than storage. Q.E.D. 😉

How will a tag-based file system help?

With a tag-based file system, a file is referenced by its tag(s) and not by its storage path. In other words, the storage path for a file is unimportant, i.e., the file can be stored anywhere!

For a real-world example, do you realize that ‘gmail’ uses the ‘tag’ approach, as opposed to the traditional ‘MS Outlook’ style ‘folder’ approach, to organize your email?! 🙂

Further examples. Suppose you have stored photos of your vacation to Italy on your hard disk. Currently, in the ‘folder’ based approach, they would be under “/home/bob/vacations/italy/pictures” & retrieved with:

$> ls /home/bob/vacations/italy/pictures

But in a tag-based approach, you would reference them with the keywords vacations, pictures, italy. So, if the ‘ls’ command were ported to understand tags, it would (perhaps) be:

$> ls vacations+italy+pictures
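
To make the idea concrete, here is a rough sketch of what could sit behind such a tag-aware ‘ls’: an in-memory index that maps each tag to the set of files carrying it, with a query answered by intersecting those sets. The `TagIndex` class, its method names and the sample file names are all made up for illustration; a real file system would keep this index on disk.

```python
from collections import defaultdict


class TagIndex:
    """Toy in-memory index: each tag maps to the set of files carrying it."""

    def __init__(self):
        self._files_by_tag = defaultdict(set)

    def add(self, name, tags):
        """Register a file under every one of its tags."""
        for tag in tags:
            self._files_by_tag[tag].add(name)

    def ls(self, query):
        """Return the files that carry every tag in an 'a+b+c' style query."""
        tags = [t for t in query.split("+") if t]
        if not tags:
            return []
        # Copy the first set so the intersection never mutates the index.
        result = set(self._files_by_tag.get(tags[0], set()))
        for tag in tags[1:]:
            result &= self._files_by_tag.get(tag, set())
        return sorted(result)


# Hypothetical sample data.
index = TagIndex()
index.add("colosseum.jpg", {"vacations", "italy", "pictures", "rome"})
index.add("venice_canal.jpg", {"vacations", "italy", "pictures"})
index.add("tickets.pdf", {"vacations", "italy"})

print(index.ls("vacations+italy+pictures"))
# ['colosseum.jpg', 'venice_canal.jpg']
```

Notice that nothing in the index says where a file physically lives, which is the whole point.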

The above is an equivalence use-case, i.e., a tag-based file system can do everything a folder-based file system can. Now, let us look into the differences. To be precise, what advantages do tag-based file systems bring to the table that folder-based ones can’t?

Negation

A use-case: retrieve all pictures from our Italy vacation that were not taken in Rome!

$> ls vacations+italy+pictures-rome

In a folder-based file system, this is not easy if all the pictures are put in the same folder!! So, negation is the simplest & almost the first extension of a tag-based file system.
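
As a sketch of how negation might be evaluated: split the query into required tags (after ‘+’) and excluded tags (after ‘-’), then keep the files that have all of the former and none of the latter. The `parse_query` and `ls` helpers and the sample files are hypothetical, and this toy syntax assumes tag names never contain ‘+’ or ‘-’ themselves.

```python
import re

# Hypothetical sample data: file name -> set of tags.
FILES = {
    "colosseum.jpg": {"vacations", "italy", "pictures", "rome"},
    "venice_canal.jpg": {"vacations", "italy", "pictures"},
    "florence_duomo.jpg": {"vacations", "italy", "pictures"},
}


def parse_query(query):
    """Split 'a+b-c' into (required tags, excluded tags)."""
    required, excluded = set(), set()
    # Each token is an optional sign followed by a tag name.
    for sign, tag in re.findall(r"([+-]?)([^+-]+)", query):
        (excluded if sign == "-" else required).add(tag)
    return required, excluded


def ls(query, files=FILES):
    """List files that have every required tag and no excluded tag."""
    required, excluded = parse_query(query)
    return sorted(
        name
        for name, tags in files.items()
        if required <= tags and not (excluded & tags)
    )


print(ls("vacations+italy+pictures-rome"))
# ['florence_duomo.jpg', 'venice_canal.jpg']
```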

Disordered order

A use-case: retrieve all pictures from our Italy vacation

$> ls vacations+italy+pictures

or

$> ls italy+vacations+pictures

or

$> ls pictures+vacations+italy

… and so on! #self-explanatory! 🙂
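
One way to see why the order cannot matter: if a query is treated as a set of tags, every spelling of it collapses to the same key. The `normalize` helper below is a made-up name for this small sketch.

```python
def normalize(query):
    """Treat an 'a+b+c' query as an unordered set of tags."""
    return frozenset(tag for tag in query.split("+") if tag)


queries = [
    "vacations+italy+pictures",
    "italy+vacations+pictures",
    "pictures+vacations+italy",
]

# All three spellings collapse to one and the same key.
keys = {normalize(q) for q in queries}
print(len(keys))
# 1
```

Since a frozenset is hashable, the same key could also be used to cache query results, whichever order the tags were typed in.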

Auto tagging

This is the obvious next step. Imagine the docs are tagged based on their context & content. We would not need to worry about properly naming the files or deciding where exactly to place a document. For example, say there are 2 projects (device driver & file system) and their design documents need to be stored. Both file names could be ‘design.doc’ &, with the auto-tagging feature, new tags would additionally & automatically be appended: device driver, linux, design, doc for the first and file system, design, doc for the second!

$> ls design+driver
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 design-1.3

vs

$> ls design+filesystem
-rwxrwx--- 1 root vboxsf   244147 2014-09-22 14:37 design_v10
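
How would the tags get there? A deliberately naive sketch: look for known keywords in the document text and attach the matching tags. The `KEYWORD_TAGS` table, the `auto_tags` function and the two sample sentences are assumptions made up for illustration; a real auto-tagger would need something far smarter than keyword lookup.

```python
import re

# Hypothetical keyword -> tag table.
KEYWORD_TAGS = {
    "driver": "device driver",
    "kernel": "linux",
    "inode": "file system",
    "design": "design",
}


def auto_tags(text):
    """Return the tags whose keyword appears as a word in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {tag for keyword, tag in KEYWORD_TAGS.items() if keyword in words}


driver_doc = "Design notes for the network driver, loaded as a kernel module."
fs_doc = "Design of the inode table and block allocator."

print(sorted(auto_tags(driver_doc)))
# ['design', 'device driver', 'linux']
print(sorted(auto_tags(fs_doc)))
# ['design', 'file system']
```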

No need for file extensions

With auto-tagging capability, the extensions are but tags! Truly! Who needs file extensions anymore? We can name the file ‘Resume’, with no need for ‘Resume.docx’ or ‘Resume.pdf’. The auto-tagging will automatically tag it as ‘pdf’ or ‘doc’ according to the true type of the file!

$> ls design+driver+pdf
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 device_driver_design-1.3

$> ls design+driver+txt
-rwxrwx--- 1 root vboxsf   244147 2014-09-22 14:37 device_driver_design-1.3
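
Detecting the “true type” does not need the file name at all, because many formats announce themselves in their first bytes. A minimal sketch, covering only a few cases: PDF files begin with `%PDF-`, ZIP containers (which is what a .docx is) begin with `PK\x03\x04`, and anything that decodes as UTF-8 is called text here. The `type_tag` function and its tag names are hypothetical.

```python
def type_tag(data: bytes) -> str:
    """Guess a type tag from the leading bytes of a file's content."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"PK\x03\x04"):
        # ZIP container; telling a .docx from other ZIPs needs a look inside.
        return "zip"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "binary"
    return "txt"


print(type_tag(b"%PDF-1.7\n..."))
# pdf
print(type_tag(b"Device driver design, plain text notes"))
# txt
```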

Archival

Reduce the clutter of files you see while browsing. Just like emails! Archive old files that are no longer relevant; they can still be retrieved whenever we want!

$> ls projectA
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 common_components-1.3
-rwxrwx--- 1 root vboxsf   244147 2014-09-22 14:37 Common_components_patch_steps

$> ls archived+projectA
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 common_components_detailed_design
-rwxrwx--- 1 root vboxsf   244147 2014-09-22 14:37 Common_components_highlevel_design
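
The behaviour in the listings above can be sketched as one extra rule on top of the plain tag match: files tagged ‘archived’ stay hidden unless the query asks for that tag explicitly. The `ls` function is hypothetical and the tag assignments are my reading of the two listings.

```python
# File name -> tags, mirroring the two listings above.
FILES = {
    "common_components-1.3": {"projectA"},
    "Common_components_patch_steps": {"projectA"},
    "common_components_detailed_design": {"projectA", "archived"},
    "Common_components_highlevel_design": {"projectA", "archived"},
}


def ls(query, files=FILES):
    """Match all tags in 'a+b'; hide archived files unless 'archived' is requested."""
    required = {tag for tag in query.split("+") if tag}
    hide_archived = "archived" not in required
    return sorted(
        name
        for name, tags in files.items()
        if required <= tags and not (hide_archived and "archived" in tags)
    )


print(ls("projectA"))
# ['Common_components_patch_steps', 'common_components-1.3']
print(ls("archived+projectA"))
# ['Common_components_highlevel_design', 'common_components_detailed_design']
```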

External-Archives

This is an extension of the above. We could even connect our hard disk & say ‘push all the archived data to the external hard disk’ 🙂
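
Since a file is found by its tags and not by its path, moving it to another disk only means updating where the index points. A rough sketch, with temporary directories standing in for the two disks; `push_archived` and the file names are made up, and name collisions on the external disk are ignored.

```python
import shutil
import tempfile
from pathlib import Path


def push_archived(tags_by_path, external_dir):
    """Move files tagged 'archived' into external_dir and return the updated index."""
    external = Path(external_dir)
    external.mkdir(parents=True, exist_ok=True)
    updated = {}
    for path, tags in tags_by_path.items():
        if "archived" in tags:
            target = external / Path(path).name
            shutil.move(str(path), str(target))
            updated[str(target)] = tags  # same tags, new location
        else:
            updated[str(path)] = tags
    return updated


# Demo: temporary directories play the internal and external disks.
with tempfile.TemporaryDirectory() as tmp:
    internal = Path(tmp) / "internal"
    external = Path(tmp) / "external"
    internal.mkdir()
    old = internal / "detailed_design"
    current = internal / "patch_steps"
    old.write_text("old design")
    current.write_text("current steps")

    index = push_archived(
        {str(old): {"projectA", "archived"}, str(current): {"projectA"}},
        external,
    )
    moved_ok = (external / "detailed_design").exists() and not old.exists()
    kept_ok = current.exists()

print(moved_ok, kept_ok)
# True True
```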

Reduce time spent on file names

If more apps (like email clients) support tags, then the recipient will also have the tags (+ auto tags). With auto-tags, imagine if we could tag pictures being imported based on their GPS data?! Or tag all emails downloaded from MS Outlook as ‘work’ & all from gmail as ‘personal’! Then the tags provided by the author (say, your manager) will get appended with our auto-tags (say, ‘work’), which means your search could be “manager+pdf+approval” instead of “project_approval.pdf”. In other words, don’t worry about naming conventions; all files could be named ‘New Document’, who cares! 🙂

$> ls projectA
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 file1
-rwxrwx--- 1 root vboxsf   244147 2014-09-22 14:37 file2
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 file3
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 file4

$> ls projectA+pdf
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 file1
-rwxrwx--- 1 root vboxsf   244147 2014-09-22 14:37 file2

$> ls projectA+pdf+manager
-rwxrwx--- 1 root vboxsf 20365048 2014-09-22 14:37 file1
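
The merging of the author’s tags with our own auto-tags could be as simple as a set union on receipt. In this sketch the `SOURCE_TAGS` table and `tags_on_receipt` function are hypothetical, following the Outlook-is-work, gmail-is-personal example above.

```python
# Hypothetical rule: where a file came from decides one extra auto-tag.
SOURCE_TAGS = {
    "outlook": {"work"},
    "gmail": {"personal"},
}


def tags_on_receipt(author_tags, source):
    """Combine the tags the author attached with our own source-based auto-tags."""
    return set(author_tags) | SOURCE_TAGS.get(source, set())


# The manager tagged the file; it arrived through the work mail client.
tags = tags_on_receipt({"manager", "approval", "pdf"}, "outlook")
print(sorted(tags))
# ['approval', 'manager', 'pdf', 'work']

# The file name never entered the picture, yet the search still matches.
print({"manager", "pdf", "approval"} <= tags)
# True
```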

Conclusion

I, obviously, am not a file system expert. But in my capacity as a file system enthusiast, I have put forth my points as to why we need a tag-based file system and what we can achieve with it. The examples & suggested extensions are not exhaustive or even close to complete. But with so many advantages staring us in the face, why is it that we haven’t moved to a tag-based file system yet?!!

What Next?

Allow me to be cocky in designing a tag-based file system. I shall give a link to the blog here ASAP. 🙂
