Inspiration

Storing large volumes of data on the Filecoin network, especially at PiB scale, inevitably relies on the IPFS technology stack and on tools, services, and infrastructure built around it. The original IPFS implementation solves the problem of how individual users can use IPFS for content distribution and data sharing, but it was designed around relatively small data volumes, and many challenges remain in storing and distributing massive data with IPFS.

What it does

DS-Cluster provides customers and storage providers with technical solutions for data management, preprocessing, and transmission, and gives users efficient data retrieval and availability.

How we built it

We organize files and objects in a Merkle-DAG structure so that multiple files or objects can share data blocks, reducing data redundancy and saving bandwidth when transmitting data over the network.
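To illustrate the idea, here is a minimal sketch of content-addressed block storage: files are chunked, each chunk is keyed by its SHA-256 digest, and identical chunks from different files are stored only once. The `blockstore` type, chunk size, and sample data are hypothetical, and real IPFS blocks are addressed by CIDs rather than raw digests.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// blockstore maps content hashes (simplified stand-ins for CIDs) to raw blocks.
type blockstore map[string][]byte

// put stores a block under its SHA-256 digest and returns the key.
// Identical blocks from different files map to the same key, so
// shared data is stored only once.
func (bs blockstore) put(block []byte) string {
	sum := sha256.Sum256(block)
	key := hex.EncodeToString(sum[:])
	bs[key] = block
	return key
}

// addFile chunks data into fixed-size blocks and returns the block keys,
// i.e. the leaf layer of a Merkle-DAG for this file.
func (bs blockstore) addFile(data []byte, chunkSize int) []string {
	var keys []string
	for off := 0; off < len(data); off += chunkSize {
		end := off + chunkSize
		if end > len(data) {
			end = len(data)
		}
		keys = append(keys, bs.put(data[off:end]))
	}
	return keys
}

func main() {
	bs := blockstore{}
	a := bs.addFile([]byte("shared-prefix-AAAA"), 4)
	b := bs.addFile([]byte("shared-prefix-BBBB"), 4)
	// Both files reference 5 blocks each, but the 3 shared-prefix blocks
	// are stored once, so only 7 unique blocks exist.
	fmt.Println("file A blocks:", len(a), "file B blocks:", len(b), "unique blocks stored:", len(bs))
}
```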

Challenges we ran into

Data management becomes more complex and faces some foreseeable challenges, including:

  • The file and object management module needs to be abstracted on top of the Merkle-DAG.
  • It's nearly impossible to delete files or objects directly; data blocks that are no longer needed can only be released through garbage collection.
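The second challenge can be sketched as a mark-and-sweep pass: blocks reachable from a pinned root survive, unreferenced blocks are released. This is a simplified model, assuming a hypothetical flat `links` map from a block key to its child keys, not the actual DS-Cluster or IPFS GC implementation.

```go
package main

import "fmt"

// gc removes blocks that are not reachable from any pinned root.
// Simplified mark-and-sweep; links maps a block key to the keys of
// its child blocks (a hypothetical flat DAG representation).
func gc(blocks map[string][]byte, links map[string][]string, roots []string) int {
	live := map[string]bool{}
	var mark func(key string)
	mark = func(key string) {
		if live[key] {
			return
		}
		live[key] = true
		for _, child := range links[key] {
			mark(child)
		}
	}
	for _, r := range roots {
		mark(r)
	}
	removed := 0
	for key := range blocks {
		if !live[key] {
			delete(blocks, key)
			removed++
		}
	}
	return removed
}

func main() {
	blocks := map[string][]byte{"root": nil, "a": nil, "b": nil, "orphan": nil}
	links := map[string][]string{"root": {"a", "b"}}
	// "orphan" is referenced by no root, so one block is released.
	fmt.Println("blocks released:", gc(blocks, links, []string{"root"}))
}
```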

Accomplishments that we're proud of

DS-Cluster can be considered the storage layer of FileDrive's distributed storage services, built on the IPFS technology stack. With DS-Cluster, we can use distributed storage nodes to provide data availability and apply erasure coding to improve data reliability and fault tolerance. At the same time, it lets us pay closer attention to data management and to changes in the storage node cluster.
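To show how erasure coding improves fault tolerance, here is the simplest possible case: a single XOR parity shard, which lets any one lost shard be rebuilt from the survivors. This is only an illustrative sketch; a production system like the one described would more likely use a Reed-Solomon code with multiple parity shards, and all shard contents below are made up.

```go
package main

import "fmt"

// xorParity computes one parity shard over equal-length data shards.
// With this parity stored on a separate node, any single lost shard
// can be rebuilt by XOR-ing the remaining shards.
func xorParity(shards [][]byte) []byte {
	parity := make([]byte, len(shards[0]))
	for _, s := range shards {
		for i, b := range s {
			parity[i] ^= b
		}
	}
	return parity
}

// recoverShard rebuilds a lost shard from the surviving data shards
// plus the parity shard.
func recoverShard(survivors [][]byte, parity []byte) []byte {
	lost := append([]byte(nil), parity...)
	for _, s := range survivors {
		for i, b := range s {
			lost[i] ^= b
		}
	}
	return lost
}

func main() {
	shards := [][]byte{[]byte("node"), []byte("fail"), []byte("safe")}
	parity := xorParity(shards)
	// Simulate losing shards[1] and rebuilding it from the rest.
	rebuilt := recoverShard([][]byte{shards[0], shards[2]}, parity)
	fmt.Printf("rebuilt shard: %s\n", rebuilt) // prints "rebuilt shard: fail"
}
```

Reed-Solomon generalizes this to k data shards plus m parity shards, tolerating the loss of any m shards at a storage overhead of m/k.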

What we learned

IPFS Cluster, the libp2p protocol, and how to integrate DS-Cluster with the Filecoin network.

What's next for DS-Cluster

  • data sharding and hash slot maintenance
  • a libp2p-based communication module between data nodes
  • consensus module build-up
  • a hash slot re-allocation and re-balancing strategy
  • data migration after hash slot re-allocation or re-balancing, to support dynamically adding or removing nodes
  • authentication and data management
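The hash slot items on the roadmap can be sketched as follows: keys hash to a fixed slot space, and each node owns a contiguous slot range, so re-balancing moves slot ranges rather than rehashing every key. The slot count of 16384 is borrowed from Redis Cluster as an illustrative assumption, and the even-split allocation is a naive placeholder for the planned re-balance strategy.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

const numSlots = 16384 // hypothetical slot count, borrowed from Redis Cluster

// slotFor maps an object key to a hash slot. Each slot is owned by
// exactly one data node, so moving data means moving slot ranges.
func slotFor(key string) int {
	return int(crc32.ChecksumIEEE([]byte(key))) % numSlots
}

// assignSlots splits the slot space evenly across nodes — a naive
// initial allocation; a real re-balance strategy would weigh node
// capacity and minimize data movement.
func assignSlots(nodes []string) map[string][2]int {
	owned := map[string][2]int{}
	per := numSlots / len(nodes)
	start := 0
	for i, n := range nodes {
		end := start + per - 1
		if i == len(nodes)-1 {
			end = numSlots - 1 // last node absorbs the remainder
		}
		owned[n] = [2]int{start, end}
		start = end + 1
	}
	return owned
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	ranges := assignSlots(nodes)
	key := "example-object-key"
	fmt.Println("slot for key:", slotFor(key))
	for _, n := range nodes {
		fmt.Printf("%s owns slots %d-%d\n", n, ranges[n][0], ranges[n][1])
	}
}
```

Adding or removing a node then reduces to re-assigning some slot ranges and migrating only the blocks whose slots changed owner.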