GlusterFS? What is it about?

1 Answer


GlusterFS? What is it about?

GlusterFS is a distributed, freely scalable file system that combines the storage of several servers into a single system. File systems work in the background, so almost no one thinks about them once they have been installed. They come to mind quickly, however, when data is lost or a file system limit is reached: for example, when the maximum size of a partition is not sufficient or the storage path is subject to restrictions.

Index
  1. What and who is behind GlusterFS?
  2. How GlusterFS works
  3. Advantages and disadvantages of GlusterFS
  4. GlusterFS Application Examples
  5. Alternatives to GlusterFS

What and who is behind GlusterFS?

The name Gluster is a blend of GNU (itself a recursive acronym for GNU's Not Unix) and the term cluster. The system is released under the GNU General Public License (GNU GPL) and can therefore be used free of charge. The term cluster, which means something like heap or group, describes, in the context of storage media, the logical combination of several physical storage devices; in computing more generally, a cluster is a set of several interconnected systems. GlusterFS combines this concept with that of GNU: it unifies the storage space of several computers and presents it as a single logical unit.

The project was presented in 2005 by the company Gluster Inc., which was acquired in 2011 by the Linux distributor Red Hat. Red Hat has continued to develop the file system since then. In January 2020, the seventh version of GlusterFS was released, which is available as precompiled packages for the following Linux distributions:

  • CentOS
  • Debian
  • Fedora
  • Red Hat / RHEL
  • SUSE
  • Ubuntu

The reason it only works on Unix-based systems is that the storage is mounted through the FUSE module, which is not yet available for Windows with the necessary stability.

Note

FUSE is the acronym for Filesystem in Userspace. Operating systems are usually divided into two spaces: user space and kernel space. The latter is highly protected and, among other security measures, is accessible only with administrator rights. For this reason, these rights are usually required to mount and manage drives. FUSE, however, allows ordinary users to manage file systems as well.

Computers can act as servers and as clients. Other systems can also simply access the file system: NFS (Network File System) and SMB/CIFS (Server Message Block / Common Internet File System) are supported.

How GlusterFS works

A distributed file system makes sense only if several computers are connected to each other. According to the official description of GlusterFS, at least three servers are required. These need not be servers in the strict sense; practically any kind of physical or emulated hardware will do. Virtual machines can be used alongside all kinds of physical computers, which brings many advantages, especially in terms of flexibility.

The participating servers operate as nodes and are connected over a TCP/IP network. Together they form what is called a trusted pool, that is, a set of trusted servers. These servers make their storage available in the form of bricks, from which volumes are created; the volumes can then be mounted and used like normal storage media. The computers that access the system are called clients, and a client can also act as a server at the same time.
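
The steps described above (form a trusted pool, expose bricks, create and mount a volume) can be sketched as a sequence of gluster command-line calls. The following Python helper is only an illustration: it builds the command lines and does not execute them, and the host names, brick path, volume name and mount point are hypothetical placeholders, not values from this article.

```python
from typing import List


def gluster_setup_commands(
    hosts: List[str],
    brick_path: str,
    volume: str,
    mount_point: str,
) -> List[List[str]]:
    """Build (but do not run) the commands for a basic GlusterFS setup.

    All arguments are illustrative placeholders chosen by the caller.
    """
    first, others = hosts[0], hosts[1:]
    commands: List[List[str]] = []

    # 1. Trusted pool: from the first node, probe every other node.
    for host in others:
        commands.append(["gluster", "peer", "probe", host])

    # 2. Volume: one brick (host:/path) per node, then start the volume.
    bricks = [f"{host}:{brick_path}" for host in hosts]
    commands.append(["gluster", "volume", "create", volume, *bricks])
    commands.append(["gluster", "volume", "start", volume])

    # 3. Client: mount the volume through the FUSE-based native client.
    commands.append(
        ["mount", "-t", "glusterfs", f"{first}:/{volume}", mount_point]
    )
    return commands


if __name__ == "__main__":
    for cmd in gluster_setup_commands(
        ["node1", "node2", "node3"], "/data/brick1", "gv0", "/mnt/gluster"
    ):
        print(" ".join(cmd))
```

Returning argument lists instead of running them keeps the sketch safe to try on a machine without GlusterFS; on a real cluster the same lists could be passed to subprocess.run with administrator rights.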

A special feature of GlusterFS is its great scalability, which allows as many nodes and bricks as desired to be added later. In this way, the storage space can be continuously adapted to current requirements. The maximum storage space that can be managed is several petabytes.

Additionally, GlusterFS can protect data against loss in the event of a failure by storing it redundantly. The risk is thereby spread over several systems, which may sit on separate physical media. RAID-like setups are also possible: for this, a replicated volume is created instead of a distributed volume, which is the default. A replicated volume stores each file in duplicate and corresponds to a RAID mirror.
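
The difference between a distributed and a replicated volume can be illustrated with a toy placement model. This is a simplified sketch, not GlusterFS's actual placement algorithm: the hash-modulo rule, the function names and the brick names are assumptions made for illustration only.

```python
import hashlib
from typing import Dict, List


def place_file(name: str, bricks: List[str], replicated: bool) -> List[str]:
    """Return the bricks that would hold a copy of `name` in this toy model."""
    if replicated:
        # Replicated volume: every brick of the replica set stores the file.
        return list(bricks)
    # Distributed volume: exactly one brick, chosen from a hash of the name.
    # (Assumption: plain hash-modulo, used here only for illustration.)
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(bricks)
    return [bricks[index]]


def surviving_files(placement: Dict[str, List[str]], failed_brick: str) -> List[str]:
    """List the files that still have at least one copy after a brick fails."""
    return sorted(
        name
        for name, holders in placement.items()
        if any(brick != failed_brick for brick in holders)
    )


if __name__ == "__main__":
    bricks = ["node1:/brick", "node2:/brick"]  # hypothetical bricks
    files = ["a.txt", "b.txt", "c.txt", "d.txt"]
    for replicated in (False, True):
        placement = {f: place_file(f, bricks, replicated) for f in files}
        left = surviving_files(placement, failed_brick="node1:/brick")
        print("replicated" if replicated else "distributed", left)
```

In the replicated case every file survives the loss of one brick, at the cost of storing each file twice; in the distributed case only the files that happened to land on the remaining brick are still available.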

Fact

A Redundant Array of Independent Disks (RAID) is a set of hard drives that are physically independent of each other but are combined into a single storage unit. Depending on the goal of the array, the emphasis can be placed on speed or on data safety. Choosing the second priority reduces the usable storage space, because data is stored more than once or additional information is kept to restore it in case of loss.

For operations on the storage, GlusterFS offers ten predefined translators, which transform requests made with user rights so that they can be executed on the storage. Examples include Storage, which stores data on the local file system and regulates access to it, and Encryption, which handles encryption.
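
The idea of translators, each layer transforming a request and handing it on to the next, can be shown with a small stacking sketch. The class names and the XOR "cipher" below are hypothetical teaching devices and have nothing to do with GlusterFS's real translator interface; XOR in particular is not real encryption.

```python
from typing import Dict, Protocol


class Translator(Protocol):
    """Minimal interface shared by every layer in this toy stack."""

    def write(self, path: str, data: bytes) -> None: ...

    def read(self, path: str) -> bytes: ...


class StorageTranslator:
    """Bottom layer: a dict stands in for the local file system."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def read(self, path: str) -> bytes:
        return self.files[path]


class XorTranslator:
    """Toy stand-in for an encryption layer (XOR is NOT secure)."""

    def __init__(self, child: Translator, key: int) -> None:
        self.child = child
        self.key = key & 0xFF  # keep the key within one byte

    def _xor(self, data: bytes) -> bytes:
        return bytes(b ^ self.key for b in data)

    def write(self, path: str, data: bytes) -> None:
        # Transform the request, then pass it down the stack.
        self.child.write(path, self._xor(data))

    def read(self, path: str) -> bytes:
        # Fetch from the layer below, then undo the transformation.
        return self._xor(self.child.read(path))


if __name__ == "__main__":
    storage = StorageTranslator()
    stack = XorTranslator(storage, key=0x5A)
    stack.write("/hello.txt", b"hello")
    print(stack.read("/hello.txt"))      # what the client sees
    print(storage.files["/hello.txt"])   # what the bottom layer stores
```

Because each layer only knows the layer directly below it, further layers could be stacked in the same way without changing the existing ones.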

A more recent addition is the geo-replication feature, which distributes data asynchronously to servers in separate locations. This also protects against physical damage to the servers, for example from fire or theft. In geo-replication, one computer takes the role of master and the other that of slave, and the data transfer is secured using SSH (Secure Shell).
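
What "asynchronous" means here can be sketched with a toy master/slave pair: a write completes locally at once, and the change reaches the remote copy only when a later synchronization run ships it. The classes and the in-memory queue are assumptions for illustration; a real deployment transfers the changes over SSH instead of calling a method directly.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class Slave:
    """Remote copy that only receives changes shipped by the master."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def apply(self, path: str, data: bytes) -> None:
        self.files[path] = data


class Master:
    """Local copy that records changes and ships them later."""

    def __init__(self, slave: Slave) -> None:
        self.files: Dict[str, bytes] = {}
        self.slave = slave
        self.pending: Deque[Tuple[str, bytes]] = deque()

    def write(self, path: str, data: bytes) -> None:
        # The local write returns immediately; the change is only queued.
        self.files[path] = data
        self.pending.append((path, data))

    def sync(self) -> int:
        """Ship all queued changes to the slave and return how many were sent."""
        shipped = 0
        while self.pending:
            path, data = self.pending.popleft()
            self.slave.apply(path, data)
            shipped += 1
        return shipped


if __name__ == "__main__":
    slave = Slave()
    master = Master(slave)
    master.write("/report.txt", b"v1")
    print("before sync:", "/report.txt" in slave.files)
    master.sync()
    print("after sync:", "/report.txt" in slave.files)
```

The window between write and sync is the trade-off of asynchronous replication: the master is never slowed down by the remote site, but changes made just before a disaster may not yet have reached the slave.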

Advantages and disadvantages of GlusterFS

In the following table we have summarized the main advantages and disadvantages of a distributed file system compared to a conventional network storage system:

Gluster advantages                   Gluster drawbacks
Good use of capacities               Complex network structure
Greater safety against breakdowns    Greater administrative effort to install it
Network load distribution            Requires a fast network infrastructure
Very good scalability                Technical safety requires additional work

GlusterFS Application Examples

In principle, GlusterFS represents a cloud model: storage space that is based on a set of computers and made available to connected clients. This system is especially useful in large networks that already have enough resources to create a cluster, that is, many computers that can be connected to each other.

Since the devices are connected to each other using the Internet Protocol, company structures with multiple branches are especially well suited to a distributed file system. However, geographically smaller networks can also use this system to avoid running their own dedicated network storage, without sacrificing redundant storage.

Tip

Do you want to work with GlusterFS yourself? IONOS has a detailed article in English with step-by-step instructions for installing and configuring the file system.

Alternatives to GlusterFS

The most popular alternative to GlusterFS is Ceph, which is also available for free and offers many of the advantages described above that are characteristic of distributed file systems. However, Ceph and Gluster each have their own advantages and disadvantages.

For high-performance computing systems, one of the Fraunhofer institutes developed BeeGFS (formerly FhGFS), which is also offered free of charge and was designed with a focus on ease of use.

In the commercial sphere there are also systems such as Microsoft's Storage Spaces Direct (S2D), which, however, can only be used with Windows servers that are subject to a paid license.


...