The Cloud Wars: from Web2 to The Graph
June 30, 2022
10 min
Being human means remembering: our memories define our identity. Today’s technologies can therefore extend our awareness by storing information, ideas and projects for us, so that we go out into the world enveloped in ‘clouds’ of data. This is the principle behind cloud storage, a solution to forgetting. Cloud services have evolved over time and, by integrating with p2p and blockchain, have given rise to several decentralised solutions. In this article, we will first explore what IPFS is, and then the difference between Filecoin and Storj. Finally, we will reach the latest stage of innovation, The Graph.
Cloud storage: heavenly data
What is cloud computing? It makes use of the Internet to provide computing power in the form of data processing and transmission. However, ‘cloud storage’ services are the premise of cloud computing: the processing of data is strictly dependent on the systems used to store it.
Almost every hi-tech company offers cloud storage: Microsoft has OneDrive, Amazon has S3 (part of AWS), and Google has Google Cloud Platform, of which most of us know Google Drive. Dropbox is another well-known solution for online cloud storage.
All alternatives provide, basically, the same potential, but what do they really have in common? Being cloud storage services provided by private parties, who own the entire infrastructure, they are centralised.
Value judgements aside, centralisation has controversial aspects: it is convenient for saving photos, documents and contacts, but is our privacy equally preserved? By entrusting the security of important files to third parties, we give up the secrecy of sensitive information.
Moreover, files in centralised clouds could be at risk: a malfunction, a natural disaster or a breach of security measures could lead to prolonged inaccessibility or irreversible loss of data, precisely because their storage has been centralised under a single trusted authority.
The issue of data storage and sharing plays an important role in the transitional moment in which the Internet finds itself: the cloud has strongly characterised Web2 in a centralised form, whereas the new trend in Web3 is Edge Computing. Essentially, data processing will increasingly take place on the ‘periphery’ of the Internet, directly on our devices, at the edge of the Internet of Things (IoT). For more on the Web3 concept, see the entry in the glossary.
In order to make systems independent of the problems associated with a central server, both in terms of computation and memory, it is also necessary to develop decentralisation solutions: to ensure the privacy and continuity of services by distributing storage between nodes, whose reliability is guaranteed by impartial protocols rather than by a company’s reputation.
What is IPFS: interplanetary data nebulae
How did your browser come across the Young Platform Academy among the endless pages on the World Wide Web? HTTP (HyperText Transfer Protocol) handles the requests made by the client, transmitting the requested resources from a centralised server.
IPFS (InterPlanetary File System) is a decentralised alternative to HTTP: a peer-to-peer network of nodes that collaborate in storing and sharing information, thus managing a cloud storage service. However, why do we need to decentralise web content?
First of all, the client/server model of today’s World Wide Web is inefficient and expensive: HTTP downloads a file from one server at a time, while IPFS can retrieve pieces of it from several nodes simultaneously, saving bandwidth.
Moreover, the cloud storage features of IPFS respect the original vision of the web, making it free and truly equal. Today, the Internet is based on the concepts of ownership and access; decentralisation will replace them with sharing and participation. Through HTTP you receive files from those who hold them in private data centres, if you are allowed access. With IPFS, by contrast, many nodes hold each other’s files and share in making them available.
Finally, the distribution between nodes supports the resilience of the Internet, i.e. the persistence of content over time, as well as making the files themselves immune to censorship. With IPFS, files are kept in multiple copies, stored in different locations. This redundancy prevents information from being obscured or deleted by the will of an individual. IPFS gives a new definition of cloud storage: decentralised, equal and secure.
How IPFS works: the Content Identifier
Basically, how does the IPFS cloud work? By revolutionising the concept of accessibility: this protocol retrieves information based on its content and not its location. This is done through the Content Identifier (CID), a code derived from a cryptographic hash of the file’s content, which takes the place of the address of a centralised server. Each file is also divided into smaller parts, which are cryptographically linked to one another through their hashes.
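To make content addressing concrete, here is a minimal sketch in Python. It is a simplification and not the actual IPFS algorithm: real CIDs wrap the hash in a self-describing format and link the parts in a tree structure, while this example only hashes fixed-size chunks and then hashes the list of chunk hashes. The names chunk_ids, content_id and CHUNK_SIZE are hypothetical, chosen for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose; hypothetical value for illustration only


def chunk_ids(data: bytes, size: int = CHUNK_SIZE) -> list[str]:
    """Split data into fixed-size chunks and return the SHA-256 hex digest of each."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [hashlib.sha256(c).hexdigest() for c in chunks]


def content_id(data: bytes, size: int = CHUNK_SIZE) -> str:
    """Derive one identifier for the whole file by hashing the list of chunk hashes."""
    joined = "".join(chunk_ids(data, size)).encode("ascii")
    return hashlib.sha256(joined).hexdigest()


original = b"hello interplanetary web"
copy_elsewhere = b"hello interplanetary web"
tampered = b"hello interplanetary wab"

# Same bytes give the same identifier, wherever they are stored.
print(content_id(original) == content_id(copy_elsewhere))  # True
# Changing a single byte changes the identifier.
print(content_id(original) == content_id(tampered))  # False
```

Because the identifier depends only on the bytes, anyone who receives the file can recompute it and check that the content was not altered along the way.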
When a file found through a CID is viewed or downloaded, the client node keeps a cached copy of it and can serve it to others, thus making the web faster. A node can obtain content from geographically close peers instead of receiving it from distant centralised hardware, which accelerates knowledge sharing. Finally, in the IPFS cloud, nodes can ‘pin’ files, so that their copy is never discarded and the content remains available for as long as the node is online.
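The interplay between caching and pinning can be sketched with a toy model. This is an illustration under simplifying assumptions, not how an IPFS node is implemented: the Node class, its methods and the use of a plain SHA-256 digest as an identifier are all hypothetical, and there is no real networking.

```python
import hashlib


def cid_of(data: bytes) -> str:
    """Stand-in for a CID: the SHA-256 hex digest of the content."""
    return hashlib.sha256(data).hexdigest()


class Node:
    """A toy peer that caches what it fetches and keeps only what it pins."""

    def __init__(self, name: str):
        self.name = name
        self.store: dict[str, bytes] = {}  # identifier -> content
        self.pinned: set[str] = set()

    def add(self, data: bytes) -> str:
        """Publish content on this node and pin it."""
        cid = cid_of(data)
        self.store[cid] = data
        self.pinned.add(cid)
        return cid

    def fetch(self, cid: str, peers: list["Node"]) -> bytes:
        """Return content from the local store, or from the first peer that has it."""
        if cid in self.store:
            return self.store[cid]
        for peer in peers:
            data = peer.store.get(cid)
            # Check that the bytes really match the identifier before caching.
            if data is not None and cid_of(data) == cid:
                self.store[cid] = data
                return data
        raise KeyError(f"no peer holds {cid}")

    def pin(self, cid: str) -> None:
        """Mark an identifier so that garbage collection keeps its content."""
        self.pinned.add(cid)

    def collect_garbage(self) -> None:
        """Drop every cached copy that is not pinned."""
        self.store = {c: d for c, d in self.store.items() if c in self.pinned}


publisher, reader = Node("publisher"), Node("reader")
cid = publisher.add(b"young platform academy")

reader.fetch(cid, [publisher])  # reader now holds a cached copy
reader.collect_garbage()        # not pinned, so the copy is dropped
print(cid in reader.store)      # False

reader.fetch(cid, [publisher])
reader.pin(cid)
reader.collect_garbage()        # pinned, so the copy survives
print(cid in reader.store)      # True
```

The sketch shows why a cached copy alone does not guarantee persistence: unless at least one node pins the content, every copy may eventually be cleaned up.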
What can you do with the IPFS cloud? Besides storing data, you can provide services through the p2p CDN (Content Delivery Network). In addition, IPFS is useful for blockchain developers: the CID allows you to store large files off-chain and later retrieve them via immutable links.
No matter if you are isolated or have a poor connection, the inter-planetary decentralisation of IPFS will allow you to access every file with ease, provided that enough nodes in the network actively participate. In fact, a universal storage project of this kind, which even aims to distribute the entire Wikipedia database through content mirroring, can only work if an increasing number of nodes collaborate in the cloud storage of files.
However, it is difficult to achieve scalability based on voluntary participation alone: thus, incentives based on crypto tokens and blockchain technology have been added to decentralised cloud storage by projects such as Filecoin, which complements IPFS, and Storj, which runs a network of its own.
Filecoin vs Storj: cheap storage
To get a better understanding of what cloud storage is, imagine you are tidying up your room: you will find all sorts of items in your drawers, but of course you will only store what you are interested in. Likewise, each node in the IPFS cloud will get rid of the cache-copies it does not need. This, however, does not promote the storage interests of others: you would never stack your brother’s comics on your desk!
Filecoin proposes a solution: an incentive protocol, complementary to IPFS cloud storage, that can create a marketplace for distributed storage. Filecoin nodes, which are at the same time peers of IPFS, can ‘rent out’ their storage in exchange for FIL tokens.
The economic incentive supports the scalability of the IPFS cloud, but how can the reliability of independent and decentralised nodes be ensured? Those providing storage space for Filecoin will only be rewarded if they can prove that they are actually storing the data. Verification takes place through Proof-of-Replication (PoRep) and Proof-of-Spacetime (PoSt) mechanisms. Unlike Proof-of-Work, these are proofs of storage: PoRep shows that a node holds its own unique copy of the data, while PoSt shows that it keeps holding that copy over time.
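The general idea of proving that data is still being stored can be sketched with a simple challenge and response. This is not PoRep or PoSt, which are considerably more sophisticated; it is a generic spot check, shown here only to illustrate why a node that has discarded the data cannot answer correctly. All names (answer, prepare_challenges, Provider, verify) are hypothetical.

```python
import hashlib
import secrets


def answer(chunk: bytes, nonce: bytes) -> str:
    """Response to a challenge: hash of a fresh nonce together with the chunk."""
    return hashlib.sha256(nonce + chunk).hexdigest()


def prepare_challenges(chunks: list[bytes], count: int) -> list[tuple[int, bytes, str]]:
    """Before handing the data over, precompute (index, nonce, expected) triples."""
    challenges = []
    for _ in range(count):
        index = secrets.randbelow(len(chunks))
        nonce = secrets.token_bytes(16)
        challenges.append((index, nonce, answer(chunks[index], nonce)))
    return challenges


class Provider:
    """A hypothetical storage provider that may or may not keep the data."""

    def __init__(self, chunks: list[bytes]):
        self.chunks = list(chunks)

    def respond(self, index: int, nonce: bytes) -> str:
        return answer(self.chunks[index], nonce)


def verify(provider: Provider, challenges: list[tuple[int, bytes, str]]) -> bool:
    """The provider passes only if every response matches the expected value."""
    return all(provider.respond(i, n) == expected for i, n, expected in challenges)


data = [b"chunk-0", b"chunk-1", b"chunk-2", b"chunk-3"]
challenges = prepare_challenges(data, count=8)

honest = Provider(data)
lazy = Provider([b"" for _ in data])  # discarded the content to save space

print(verify(honest, challenges))  # True
print(verify(lazy, challenges))    # False
```

Since each nonce is random and used once, the provider cannot prepare answers in advance: it needs the actual chunk at the moment it is challenged.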
Thus, miners are responsible for archiving and retrieving files but, in order to perform competitively, they may specialise in only one of these two actions. Miners who are in charge of archiving must also pledge funds as collateral, in proportion to the amount of data handled: if they fail to keep the data, they lose part of it, which acts as a further anti-fraud measure.
The mission of the Storj project is the same: to encourage users, through the token of the same name (STORJ), to share their unused storage resources. A small node may have access to cheaper electricity than centralised data centres, thus providing a cheaper cloud alternative.
Payments in Storj’s network are administered by Satellites, modules that co-ordinate operations between clients and nodes: they reward the storage nodes for the amount of storage and bandwidth they continuously provide to the cloud.
Satellites will test nodes from time to time, in so-called audits, to check their commitment: if a node shows inactivity, it will be deprived of part of its reward. The confiscated tokens will then be used to recover any damaged files or to create other nodes in the cloud. Satellites are in turn remunerated for this preservation work and for storing certain metadata (such as the location of files in the cloud).
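The audit logic described above can be sketched as a short calculation. The penalty rate, the reward amounts and every name in this example (StorageNode, run_audit, PENALTY_RATE) are hypothetical, and checking an online flag is only a stand-in for a real audit.

```python
from dataclasses import dataclass

PENALTY_RATE = 0.25  # hypothetical share of the reward withheld per failed audit


@dataclass
class StorageNode:
    name: str
    online: bool
    reward: float  # reward accrued for this period, in tokens (illustrative)


def run_audit(nodes: list[StorageNode], penalty_rate: float = PENALTY_RATE) -> float:
    """Withhold part of the reward of every node that fails the audit.

    Returns the total amount withheld, which can then fund the repair
    of damaged files or the addition of new nodes.
    """
    repair_fund = 0.0
    for node in nodes:
        if not node.online:  # stand-in for "failed the audit"
            withheld = node.reward * penalty_rate
            node.reward -= withheld
            repair_fund += withheld
    return repair_fund


nodes = [
    StorageNode("alpha", online=True, reward=10.0),
    StorageNode("beta", online=False, reward=8.0),
]
fund = run_audit(nodes)
print(fund)                       # 2.0
print([n.reward for n in nodes])  # [10.0, 6.0]
```

The node that stayed online keeps its full reward, while a quarter of the inactive node’s reward moves into the fund used for recovery.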
Storj’s cloud storage is based on a p2p smart contract that matches the demand for storage space with its supply, at a cost set by the provider nodes through StorJShare. The open-source Storj protocol stores information redundantly and secures it through encryption; in addition, Storj’s code is compatible with Amazon S3.
The Graph: rearranging the clouds
Let’s go back to our analogy. Ever since you realised what cloud storage on blockchain is, storage rewards have turned your room into a storage room: you gave that space to your brother, but then the rest of the family also made arrangements with you. You are a little node in a storage network, but now try to find your yearbook pictures in this shared archive: you would have to go through every cupboard, box and binder to find them!
The Graph can assist you in your search: its protocol adds data indexing to the decentralised cloud storage of IPFS and the incentives of Filecoin and Storj. If your room were Ethereum’s blockchain, each drawer would be a layer 2 blockchain: each category of transaction in them would be labelled by The Graph’s network through a description (manifest), and the information would be collected in indexed datasets called ‘subgraphs’.
In a nutshell, how does The Graph work? The protocol ‘learns’ what to index through subgraph manifests: these define the smart contracts of interest, the events to watch and how the data they generate should be stored. Once a manifest is written and deployed, the indexers (incentivised by the GRT token) will start to collect data accordingly.
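A toy version of this process is sketched below. It is not The Graph’s actual manifest format or indexing engine: the manifest is reduced to a small dictionary, the chain to a list of blocks, and the contract addresses, event names and fields are all hypothetical.

```python
from collections import defaultdict

# Hypothetical manifest: which contract to watch and which events to index.
manifest = {
    "contract": "0xTOKEN",
    "events": ["Transfer"],
}

# Hypothetical chain data: each block is a list of emitted events.
blocks = [
    [{"contract": "0xTOKEN", "event": "Transfer", "from": "ann", "to": "bob", "value": 5}],
    [{"contract": "0xOTHER", "event": "Transfer", "from": "bob", "to": "cat", "value": 1}],
    [
        {"contract": "0xTOKEN", "event": "Approval", "owner": "ann", "spender": "bob"},
        {"contract": "0xTOKEN", "event": "Transfer", "from": "bob", "to": "ann", "value": 2},
    ],
]


def build_index(manifest: dict, blocks: list) -> dict:
    """Scan every block once and group the matching events by sender."""
    index = defaultdict(list)
    for number, events in enumerate(blocks):
        for ev in events:
            if ev["contract"] == manifest["contract"] and ev["event"] in manifest["events"]:
                index[ev["from"]].append(
                    {"block": number, "to": ev["to"], "value": ev["value"]}
                )
    return dict(index)


index = build_index(manifest, blocks)
# A query is now a dictionary lookup instead of a scan of the whole chain.
print(index["bob"])   # [{'block': 2, 'to': 'ann', 'value': 2}]
print(sorted(index))  # ['ann', 'bob']
```

The manifest decides what is worth collecting, the indexer does the scanning once, and later searches only touch the prepared index.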
Imagine you are in a large library, searching for a book: a kind clerk will point you in the right direction. Similarly, The Graph’s protocol makes decentralised databases even more accessible: subgraphs let developers ask the blockchain the right questions, so that they can quickly find the data needed for smart contracts and Dapps to work.
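In practice, these questions are written in GraphQL and sent to a subgraph as a JSON request body. The sketch below only builds such a body and does not contact any endpoint. The helper build_query and the schema names (transfers, id, from, to, value) are hypothetical, since every subgraph defines its own entities.

```python
import json


def build_query(entity: str, fields: list[str], first: int = 5) -> str:
    """Return the JSON body of a GraphQL request asking for the first N entities."""
    selection = " ".join(fields)
    query = f"{{ {entity}(first: {first}) {{ {selection} }} }}"
    return json.dumps({"query": query})


# 'transfers', 'id', 'from', 'to' and 'value' are hypothetical schema names.
body = build_query("transfers", ["id", "from", "to", "value"], first=3)
print(body)
# {"query": "{ transfers(first: 3) { id from to value } }"}
```

A developer would send this body in an HTTP POST request to the address of the chosen subgraph and receive the matching records as JSON.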
Cloud computing needs a secure repository for information to be processed, just as cloud storage would not be efficient if it were disorganised. That is why the storage of IPFS, the incentives of Filecoin and Storj, and the indexing of The Graph are essential to initiate the next phase of the Internet: Web3.