Distributed File System

C++, gRPC, Protocol Buffers
Role(s): Solo Developer
Context: Graduate Course Project
Year: 2024

Overview

As the final project of my Graduate Operating Systems course, I created a multithreaded distributed file system based on the Andrew File System. It features chunk-based file transfer built on gRPC and Protocol Buffers, along with real-time synchronization between a server and multiple clients.

Notable Features

Real-Time File Synchronization

Clients and the server stay synchronized automatically through a two-channel architecture. On each client, the watcher channel uses inotify to monitor the local directory for file create, modify, and delete events. In the other direction, the CallbackList RPC channel reads from the server and writes changes to clients. To prevent write conflicts, a client must acquire a distributed write lock from the server before uploading or deleting a file.
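
The locking rule above can be sketched as a small server-side table mapping each file to the client that currently holds its write lock. This is a minimal in-memory sketch, not the project's actual code: the WriteLockTable class, its method names, and the use of a string client ID are assumptions made for illustration, and lock expiry is left out.

```cpp
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical server-side table of per-file write locks (illustrative names).
class WriteLockTable {
public:
    // Grants the lock if the file is unlocked or already held by this client.
    bool acquire(const std::string& filename, const std::string& client_id) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto result = owners_.emplace(filename, client_id);
        return result.second || result.first->second == client_id;
    }

    // Releases the lock only if this client is the current holder.
    bool release(const std::string& filename, const std::string& client_id) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = owners_.find(filename);
        if (it == owners_.end() || it->second != client_id) return false;
        owners_.erase(it);
        return true;
    }

private:
    std::mutex mutex_;  // guards owners_ across concurrent RPC handler threads
    std::unordered_map<std::string, std::string> owners_;  // filename -> holder
};
```

In this sketch a second client asking for a held lock is simply refused, so the caller would have to retry or give up; a real server would likely also need a way to reclaim locks from clients that disconnect.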

GIF showing automatic file synchronization

Figure 1: Automatic file synchronization using inotify and CallbackList RPCs

Deletions propagate through tombstoning. When a client deletes a file, the server records a tombstone containing the file’s metadata. Clients then compare their local files against the tombstone list to decide whether a file should be uploaded to the server or removed locally, which prevents “file resurrections”.
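
One way the client-side comparison could work is sketched below. The decision rule is an assumption: it treats the tombstone's metadata as a deletion timestamp and keeps a local file only if it was modified after that time. The Tombstone struct, the SyncAction enum, and the reconcile function are hypothetical names, not identifiers from the project.

```cpp
#include <cstdint>
#include <map>
#include <string>

enum class SyncAction { Upload, DeleteLocal };

// Assumed tombstone metadata: when the file was deleted (seconds since epoch).
struct Tombstone {
    std::int64_t deleted_at;
};

// Decides what to do with a local file that the server does not currently have.
SyncAction reconcile(const std::string& filename, std::int64_t local_mtime,
                     const std::map<std::string, Tombstone>& tombstones) {
    const auto it = tombstones.find(filename);
    if (it == tombstones.end()) {
        return SyncAction::Upload;  // never deleted: this is a new file
    }
    // Modified after the deletion: treat it as a fresh file and upload it.
    // Otherwise it is a stale copy of a deleted file and must not come back.
    return local_mtime > it->second.deleted_at ? SyncAction::Upload
                                               : SyncAction::DeleteLocal;
}
```

Comparing timestamps across machines assumes reasonably synchronized clocks; a version counter in the tombstone would avoid that dependency.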

Chunk-Based File Transfer

To transfer large files, the distributed file system uses streaming RPCs that send data in fixed-size chunks. When storing a file, the client repeatedly reads a chunk into a buffer, adds it to a StoreRequest, and sends the StoreRequest to the server. The server mirrors this process: it opens an ofstream and writes each chunk to disk as it is received.

GIF showing chunk-based file transfer

Figure 2: Chunk-based file transfer using streaming RPCs. The GIF is slowed to 0.25x speed for clarity.

To avoid re-sending files the server already has, the client compares a checksum of the local file against the server’s version when one exists. If the checksums match, the transfer is skipped entirely.
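
The skip check can be sketched as follows. The checksum algorithm is not named above, so this example uses 32-bit FNV-1a purely as a stand-in; the project may well use something else, such as CRC32. The ServerFileInfo struct and the needs_upload function are hypothetical names.

```cpp
#include <cstdint>
#include <string>

// 32-bit FNV-1a, used here only as a simple stand-in checksum.
std::uint32_t fnv1a32(const std::string& bytes) {
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    for (unsigned char byte : bytes) {
        hash ^= byte;
        hash *= 16777619u;  // FNV prime; wraps modulo 2^32
    }
    return hash;
}

// Assumed shape of what the server reports about its copy of a file.
struct ServerFileInfo {
    bool exists;             // false if the server has no copy
    std::uint32_t checksum;  // meaningful only when exists is true
};

// Returns true if the local contents must be sent to the server.
bool needs_upload(const std::string& local_bytes, const ServerFileInfo& remote) {
    if (!remote.exists) return true;  // nothing to compare against
    return fnv1a32(local_bytes) != remote.checksum;
}
```

For large files, the checksum would be better computed incrementally over chunks rather than over the whole file in memory, as this sketch does.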

Retrospective

This project was my first experience building a distributed system from the ground up. While I had previously worked with client-server architectures, this project forced me to consider how synchronization, consistency, and fault tolerance are handled in a distributed environment. That said, while my implementation was functional and fulfilled the course requirements, there are several areas I would improve upon if given more time:

  • Create a "janitor" function to clean up tombstones and stale locks to prevent unbounded growth of metadata.
  • Implement more robust error handling and retries for network operations to improve resilience.
  • Add unit and integration tests to ensure correctness of file operations and synchronization logic.
  • Look into more efficient file watching mechanisms to reduce resource usage on clients.