Flowchart

Component chart

  • Overall chart

Upload/Edit flow

Client 1 sends two requests in parallel: one adds the file metadata, and the other uploads the file to cloud storage.

Upload file metadata

  1. Client app computes the file metadata to be uploaded.

    • File name and the MD5 of the file content.

    • Number of blocks (assuming each block is 4 MB) and the MD5 value of each block.

  2. Client app sends the metadata to web server.

  3. Web server computes globally unique block IDs for each block.

  4. Web server writes the metadata and block IDs to the metadata DB and sets the file upload status to “pending.”

  5. Web server returns the block IDs to client app.
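The metadata steps above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the function names, the dictionary layout, and the use of random UUIDs as block IDs are all assumptions made for the example. Only the 4 MB block size and the MD5 checksums come from the steps themselves.

```python
import hashlib
import uuid

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB per block, as assumed in the steps above


def compute_file_metadata(name, content, block_size=BLOCK_SIZE):
    """Client side: build the metadata to send (hypothetical dictionary shape)."""
    blocks = [content[i:i + block_size] for i in range(0, len(content), block_size)]
    return {
        "file_name": name,
        "file_md5": hashlib.md5(content).hexdigest(),
        "block_count": len(blocks),
        "block_md5s": [hashlib.md5(block).hexdigest() for block in blocks],
    }


def assign_block_ids(metadata):
    """Server side: one unique ID per block (random UUIDs are an assumption)."""
    return [uuid.uuid4().hex for _ in range(metadata["block_count"])]


def build_pending_record(metadata, block_ids):
    """Server side: the row written to the metadata DB, marked as pending."""
    return {**metadata, "block_ids": block_ids, "status": "pending"}
```

A client would call `compute_file_metadata`, and the web server would then call `assign_block_ids` and store the result of `build_pending_record` before returning the block IDs.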

Upload file

  1. Client app sends requests to block servers to upload blocks.

  2. Block servers connect to metadata DB to verify permissions.

  3. Block servers verify the MD5 value of each block.

  4. Block servers store blocks inside object storage.
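As a rough sketch of the block server steps above, the class below checks permissions, verifies the MD5 of an uploaded block against the value recorded earlier, and only then stores it. The class name, the in-memory dictionary standing in for object storage, and the simple set of allowed users are assumptions for illustration; a real block server would query the metadata DB and an object store.

```python
import hashlib


class BlockServer:
    """Toy block server; a dict stands in for object storage (assumption)."""

    def __init__(self, expected_md5s, allowed_users):
        self.expected_md5s = expected_md5s  # block_id -> MD5 from the metadata DB
        self.allowed_users = allowed_users  # stand-in for a real permission check
        self.object_storage = {}            # block_id -> stored bytes

    def upload_block(self, user, block_id, data):
        # Step 2: verify permissions before touching the data.
        if user not in self.allowed_users:
            raise PermissionError(f"{user} may not upload blocks")
        if block_id not in self.expected_md5s:
            raise KeyError(f"unknown block ID: {block_id}")
        # Step 3: reject blocks whose content does not match the recorded MD5.
        if hashlib.md5(data).hexdigest() != self.expected_md5s[block_id]:
            raise ValueError(f"MD5 mismatch for block {block_id}")
        # Step 4: store the verified block.
        self.object_storage[block_id] = data
```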

Notify other clients (Optional)

  1. Web servers notify notification service that a new file is being added.

  2. The notification service notifies relevant clients (client 2) that a file is being uploaded.

Download flow

  • Download flow is triggered when:

    • A file is added or edited elsewhere.

    • The user proactively requests to sync files.

Get notified about updates (Optional)

  1. Notification service informs client app that a file has been changed somewhere else.

Fetch metadata

  1. Client app sends requests to web servers to download files.

  2. Web servers call metadata DB to fetch metadata of changes and block IDs.

  3. Web servers return the block IDs and the block servers that hold those blocks to the client app.

Fetch files

  1. Client app sends requests to block servers to download blocks.

  2. Block servers fetch metadata from metadata DB and verify permissions.

  3. Block servers fetch blocks from object storage.

  4. Block servers return blocks to client app.

  5. Client app verifies the MD5 values and rebuilds the entire file.
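The final client step above can be sketched as follows, assuming the client already holds the downloaded blocks in order along with the per-block and whole-file MD5 values from the metadata. The function name and argument layout are hypothetical.

```python
import hashlib


def assemble_file(blocks, block_md5s, file_md5):
    """Verify each downloaded block, join them, then verify the whole file."""
    if len(blocks) != len(block_md5s):
        raise ValueError("number of blocks does not match the metadata")
    for index, (block, expected) in enumerate(zip(blocks, block_md5s)):
        if hashlib.md5(block).hexdigest() != expected:
            raise ValueError(f"block {index} failed its MD5 check")
    content = b"".join(blocks)
    if hashlib.md5(content).hexdigest() != file_md5:
        raise ValueError("assembled file failed its MD5 check")
    return content
```

Checking each block before the whole file lets the client re-fetch only the block that failed rather than the entire file.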

Notification flow

  • How does a client know if a file is added or edited by another client?

Notify online and offline client

  • Client online: the notification service informs the client that changes were made elsewhere, so it needs to pull the latest data.

  • Client offline: when another client changes a file while this client is offline, the changes are saved to a cache. Once the offline client comes back online, it pulls the latest changes.
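The online and offline cases above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class, its method names, and the use of plain dictionaries for the cache are invented for the example and say nothing about how the real notification service stores pending changes.

```python
class NotificationService:
    """Toy notification service; names and structure are illustrative."""

    def __init__(self):
        self.online = set()   # client IDs currently connected
        self.delivered = {}   # client_id -> notifications pushed while online
        self.cache = {}       # client_id -> changes saved while offline

    def connect(self, client_id):
        """Mark a client online and hand back the changes it missed."""
        self.online.add(client_id)
        return self.cache.pop(client_id, [])

    def disconnect(self, client_id):
        self.online.discard(client_id)

    def file_changed(self, file_name, interested_clients):
        """Notify online clients now; cache the change for offline ones."""
        for client_id in interested_clients:
            if client_id in self.online:
                self.delivered.setdefault(client_id, []).append(file_name)
            else:
                self.cache.setdefault(client_id, []).append(file_name)
```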

Long polling vs WebSocket

  • Even though both options work well, we opt for long polling for the following two reasons:

    • Communication for notification service is not bi-directional. The server sends information about file changes to the client, but not vice versa.

    • WebSocket is suited for real-time bi-directional communication such as a chat app.
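To make the one-directional nature of long polling concrete, here is a minimal in-process sketch. It models only the waiting behavior: a poll blocks until the server has a newer change or the timeout expires. The class name, the integer version counter, and the use of `threading.Condition` instead of real HTTP requests are assumptions made for the example.

```python
import threading


class LongPollChannel:
    """In-process stand-in for a long-polling endpoint (illustrative only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._version = 0  # incremented each time a file change is published

    def publish(self):
        """Server side: record a change and wake any waiting polls."""
        with self._cond:
            self._version += 1
            self._cond.notify_all()

    def poll(self, last_seen, timeout):
        """Client side: block until a newer version exists or the timeout hits.

        Returns the new version number, or None if nothing changed in time.
        """
        with self._cond:
            changed = self._cond.wait_for(
                lambda: self._version > last_seen, timeout=timeout
            )
            return self._version if changed else None
```

On a `None` result the client simply issues the next poll, so information only ever flows from server to client.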
