EGMI stands for Eskimo Gluster Management Interface. EGMI is a daemon running on machines or containers alongside Gluster FS and taking care of managing gluster volumes and peers automatically (for most common operations).
The fundamental idea behind EGMI is that Big Data System administrators should not have to do so much manual operations to build and maintain a gluster cluster with its volumes and peers.
EGMI inspires from the way most widely used Big Data / NoSQL backends manage their nodes, shards and replicas
transparently, balancing new replicas to new nodes automatically whenever a node goes down, etc. without an
administrator needing to really worry about it.
EGMI aims eventually at bringing the same level of automation and reliability on top of Gluster FS and at simplifying
most trivial aspects of gluster volumes management and repairing.
EGMI also includes a web interface for monitoring and to help administrators perform some simple manual operations and configuration.
EGMI is currently in early development stage but still considered production ready (in terms of quality) on a limited set of use cases. EGMI is used by Eskimo for Gluster cluster Management.
Automatic nodes and volumes management
With EGMI, the administrators should define in EGMI’s configuration file:
-
Either a Zookeeper URL when using Zookeeper for master election and node discovery.
-
In this case, the managed nodes are discovered automatically through zookeeper
-
Also, one of the EGMI instance registering to zookeeper is elected as master and will take care of global cluster management.
-
-
Or, in case zookeeper is not used (or used only for master election), a list of Managed nodes: the set of nodes where gluster is expected to be running. Additional nodes discovered in gluster peers are considered as well.
-
The Managed Volumes: the set of gluster volumes to be created and managed. Additional volumes discovered in gluster peers are considered as well.
-
The target number of replicas: Either as a fixed number or a strategy to compute the number, EGMI will manage and extend volumes as required to respect this number whenever possible.
-
The target number of shards: Either as a fixed number or a strategy to compute the number, EGMI will attempt to create as many bricks as required to reach this number of shards (distribution) times the number of replicas if possible.
EGMI is deployed on every node running gluster
EGMI is designed to be deployed alongside Gluster FS on every node or container from the gluster cluster. This is a
strict requirement since every EGMI instance will take care of operating the co-located Gluster FS instance.
At every moment in time, only one of these EGMI process is acting as master and is taking decisions for the whole
cluster.
Using Zookeeper for master election and Node discovery
EGMI uses Zookeeper for master election. Zookeeper is used solely to elect the master taking decisions for the whole
cluster as well as discovering registered nodes and not for state storage.
EGMI stores it’s state directly in Gluster’s meta data.
EGMI can also run without Zookeeper. In such case, the list of expected nodes has to be configured in a static way in
EGMI’s configuration file.
This static declaration of nodes is also required if Zookeeper is used solely for master election and node discovery
is disabled.
Stateless
EGMI itself is stateless. EGMI uses Gluster FS’s own meta-data to store state and always returns to querying Gluster to recover the Gluster cluster state. Optionally, when zookeper is used, some configuration elements (such as registered nodes) are retrieved from zookeeper.
UI Redirection to master node
Every EGMI instance running on every cluster nodes has a Web UI - Web User Interface - up, running and available on the configured port. However, only the master node UI is answering. Slave nodes UI on slave nodes redirect the user to the master node UI.
EGMI needs following software available in order to be built:
-
Open JDK 11 (or compatible) with
java
inPATH
. -
Apache Maven 3.x (or compatible) with
mvn
inPATH
All the rest is expressed as maven dependencies and fetched from maven repositories as part of the maven build proces.
Again, EGMI has to be installed on every machine or container running Gluster FS, installed alongside Gluster.
EGMI is started using the startup script egmi.sh
or the provided SystemD unit file and setup script.
Java 11 (or above) binaries need to be available in the system path to run EGMI.
One just needs to extract the EGMI archive to the root folder where one wants to install EGMI.
EGMI is configured in egmi.properties
configuration file located under sub-folder conf
under the root EGMI
installation folder.
The most essential configuration properties to be adapted whenever EGMI is to be used outside of Eskimo are as follows.
-
zookeeper.urls
: the URL(s) (coma-separated list of IP:PORT where zookeeper is expected.) at which zookeeper server(s) is(are) expected. Whenever this is configured, EGMI will use zookeeper for master election. Leave it blank to force either master or slave on one EGMI instance without using zookeeper. (Seemaster=true|false
below) -
hostname
: the hostname this instance of EGMI is identified by on the gluster cluster (most of the time the IP address of the node) -
server.port
: the port EGMI listens to (both EGMI UI and EGMI command server) -
remote.egmi.port
: the port where the remote EGMI command server listens to. This should in principle be the same port as above. But in case the EGMI master manages a set of remote slaves running on a different port, this can be useful. -
target.ip-addresses
: coma-separated hostnames or IP addresses of the gluster cluster. EGMI will connect all these nodes together (add peers in pool) if some nodes are disconnected from the gluster.
This should be left blank to rely on zookeeper for data node discovery. EGMI is indeed able to discover the various nodes running EGMI / Gluster cluster from zookeeper. Whenever one doesn’t want to rely on zookeeper to discover nodes, this configuration property can be used. -
master
: set totrue
orfalse
to force that very instance to be master or slave on one EGMI instance regardless of zookeeper election process. (This should be used with caution since no checks are done on misconfiguration ending up with multiple masters.)
-
data
: set totrue
to have the node registerd as a data node (managed gluster node) within zookeeper. Set tofalse
to have that EGMI running as a standalone process (without a co-located gluster process to be managed). -
target.volumes
: coma-separated list of volumes to be managed. This has to be given and needs to be consistent across EGMI instances. -
config-storage-path
: where the EGMI runtime configuration (meta-data) has to be stored. EGMI is more or less stateless but some of the discovered nodes or volumes are tracked in a meta-data file stored there. (If this file is deleted, it doesn’t impact EGMI significantly) -
zookeeper.sessionTimeout
: the zookeeper session timeout (used to trigger a new master election) -
master.redirect.URLPattern
: the URL pattern used to redirect users reaching an EGMI slave to the master.
-
target.numberOfBricks
: the number of bricks to create and manage for volumes (either a fixed number orALL_NODES
to have every volume having a brick on every node orLOG_DISPATCH
to have shares and replicas distributed on log(n) nodes) -
target.defaultNumberReplica
: the target number of replicas to try to respect for every node.
Important note
This configuration needs to be aligned on every node. It is not a strict requirement and a configuration discrepency
between nodes may be somewhat tolerated by EGMI.
It could however lead to unexpected results and every node in the gluster cluster should really have the same EGMI
configuration.
EGMI is part of the Eskimo software platform.
Eskimo is Copyright 2019 - 2023 eskimo.sh - All rights reserved.
Author : https://www.eskimo.sh
Eskimo is available under a dual licensing model : commercial and GNU AGPL.
If you did not acquire a commercial licence for Eskimo, you can still use it and consider it free software under the
terms of the GNU Affero Public License. You can redistribute it and/or modify it under the terms of the GNU Affero
Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option)
any later version.
Compliance to each and every aspect of the GNU Affero Public License is mandatory for users who did no acquire a
commercial license.
Eskimo is distributed as a free software under GNU AGPL in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero Public License for more details.
You should have received a copy of the GNU Affero Public License along with Eskimo. If not, see https://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA, 02110-1301 USA.
You can be released from the requirements of the license by purchasing a commercial license. Buying such a commercial license is mandatory as soon as :
-
you develop activities involving Eskimo without disclosing the source code of your own product, software, platform, use cases or scripts.
-
you deploy eskimo as part of a commercial product, platform or software.
For more information, please contact eskimo.sh at https://www.eskimo.sh
The above copyright notice and this licensing notice shall be included in all copies or substantial portions of the Software.