InfluxDB: explanation, benefits and first steps


Recording scientific or technical measurement data with sensors generates huge amounts of data in a very short time, and each value has to be processed together with the timestamp of the moment it was measured. Such time series data calls for specialized databases. This article covers InfluxDB, a database management system (DBMS) developed specifically for this task.

Index
  1. What is InfluxDB?
  2. When is InfluxDB used?
  3. What are the advantages of InfluxDB?
  4. First steps in InfluxDB
  5. An overview of the most important news in InfluxDB Cloud 2.0

What is InfluxDB?

InfluxDB is a database management system developed by InfluxData, Inc. It is open source software and can be used free of charge. The commercial version, InfluxDB Enterprise, adds maintenance contracts and special access controls for commercial clients and is installed on a server within the company network.

The latest version, InfluxDB 2.0, is also available as a fully customizable cloud service with a web-based user interface for recording and visualizing data.

The InfluxDB database management system is written in Go, the programming language created at Google, also known as Golang. The first version used InfluxQL, a query language created by the vendor itself, to query the databases. InfluxDB 2.0 instead introduced the new Flux language, which InfluxData has published as open source software on GitHub under the MIT license. The project continues to be developed there with the participation of other developers who work with time series data.

Flux is a stand-alone scripting and query language for time series databases (TSDB). It can be used with InfluxDB from version 1.7 onward, or completely independently, even in combination with databases from other providers.

Flux has been optimized for the ETL (Extract, Transform, Load) process in databases and is not compatible with the InfluxQL query language that was previously used. However, the vendor plans to develop a migration path for its regular customers that includes the translation of the InfluxQL code to Flux.

Flux syntax is modeled on the popular JavaScript scripting language, which makes it easy to learn and flexible to use. An essential feature of Flux is its compatibility with different data sources, for example through third-party APIs.

In this way, Flux can work with analysis tools such as Jupyter. The Apache Arrow data exchange interface allows communication with other systems and integration into Big Data environments.
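
To give a feel for the JavaScript-like syntax, here is a minimal sketch in Python that assembles a simple Flux query as a string. The bucket name "sensors", the measurement name "readings" and the helper build_flux_query are hypothetical and chosen only for illustration; the functions from, range and filter and the pipe-forward operator |> belong to Flux itself.

```python
def build_flux_query(bucket: str, measurement: str, start: str = "-1h") -> str:
    """Return a Flux query that reads recent points of one measurement.

    The bucket and measurement names passed in are illustrative placeholders.
    """
    return "\n".join([
        # Select the data source (a bucket) ...
        f'from(bucket: "{bucket}")',
        # ... restrict it to a relative time range ...
        f"  |> range(start: {start})",
        # ... and filter rows; the predicate is an arrow function, as in JavaScript.
        f'  |> filter(fn: (r) => r._measurement == "{measurement}")',
    ])


if __name__ == "__main__":
    print(build_flux_query("sensors", "readings"))
```

Each step passes its result to the next one through |>, so a query reads from top to bottom like a processing pipeline.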

When is InfluxDB used?

InfluxDB is a time series database (TSDB), that is, a database designed to store time series. Such databases are used, among other things, to store and evaluate timestamped sensor data or log records over a specified period of time.

In these cases, millions of data points, such as those delivered by Internet of Things devices or scientific measuring instruments, can arrive in a continuous stream. This type of data must be processed quickly, as soon as it reaches the database.

For this reason, InfluxDB relies on the Network Time Protocol (NTP) to keep time synchronized across all systems.

InfluxDB databases are usually very compact and only need two or three columns. For example, they store the source of the data, the value itself and the corresponding timestamp.

Sensor     Value    Time
Sensor 1   140.50   04/23/2020 @ 10:00
Sensor 2   110.02   04/23/2020 @ 10:00
Sensor 1   142.32   04/23/2020 @ 10:05
Sensor 2   110.50   04/23/2020 @ 10:05
...        ...      ...

InfluxDB distinguishes between tags and fields. Tags contain only metadata, which is included in the index, while fields hold the values that can be evaluated later. In our example, therefore, the first column is a tag and the second is a field. This distinction makes it easier to manage the database and to evaluate the measurement data.
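
As an illustration of tags and fields, the following Python sketch turns rows like those in the table into InfluxDB line protocol, the text format "measurement,tag=value field=value timestamp" that InfluxDB accepts for writes. The measurement name "readings", the tag key "sensor" and the field key "value" are hypothetical names chosen for this example, and the timestamps are assumed to be in UTC because the table does not state a time zone.

```python
from datetime import datetime, timezone


def escape_tag(text: str) -> str:
    """Backslash-escape commas, equals signs and spaces in tag keys and values."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement: str, sensor: str, value: float, when: datetime) -> str:
    """Format one row as: measurement,tag=value field=value timestamp."""
    ts_ns = int(when.timestamp()) * 10**9  # nanoseconds since the Unix epoch
    # "sensor" is written as a tag (indexed metadata), "value" as a field.
    return f"{measurement},sensor={escape_tag(sensor)} value={float(value)!r} {ts_ns}"


if __name__ == "__main__":
    # Rows from the example table; UTC is an assumption.
    rows = [
        ("Sensor 1", 140.50, datetime(2020, 4, 23, 10, 0, tzinfo=timezone.utc)),
        ("Sensor 2", 110.02, datetime(2020, 4, 23, 10, 0, tzinfo=timezone.utc)),
        ("Sensor 1", 142.32, datetime(2020, 4, 23, 10, 5, tzinfo=timezone.utc)),
        ("Sensor 2", 110.50, datetime(2020, 4, 23, 10, 5, tzinfo=timezone.utc)),
    ]
    for sensor, value, when in rows:
        print(to_line("readings", sensor, value, when))
```

The sketch only handles float fields; integer and string fields use slightly different notations in line protocol and are left out here.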

What are the advantages of InfluxDB?

TSDBs like InfluxDB are much faster than relational databases at storing and processing timestamped measurement data. A relational database management system (DBMS) spends part of its performance on maintaining a complex index that is not needed in this field of application. InfluxDB, by contrast, uses a very simple index and can therefore sustain a high write speed.

Unlike with the previous 1.x version, InfluxData offers a cloud solution with the new InfluxDB Cloud 2.0, available on Amazon Web Services (AWS), Google Cloud Platform (GCP) or Microsoft Azure. With this so-called serverless computing, the client does not need its own server infrastructure.

In the cloud variant, there is no need to reserve individual servers, because the system adapts automatically to the current load. This is a great advantage in industrial IoT or machine learning applications, where data volumes tend to fluctuate strongly. The first version still needed additional components to be fully functional: the so-called TICK stack, with products such as Telegraf, Chronograf and Kapacitor. InfluxDB 2.0, by contrast, already includes everything you need.

In the local variant, which is installed on your own server, the entire database management system is likewise contained in a single program file that, to date, is only available for 64-bit Linux, Linux for ARM processors, macOS and as a Docker container. Telegraf can still be used as a collection agent for InfluxDB 2.0, alongside other similar agents.

First steps in InfluxDB

To get you started, InfluxData offers free access to InfluxDB Cloud 2.0. This gives you trial access not only to the database, but to the entire hosted time series data platform with multi-user capability. InfluxDB Cloud 2.0 also includes modules for collecting, evaluating and visualizing stored data.

The free version is subject to read and write restrictions: a maximum of 10,000 data sets and a maximum storage period of 30 days. As a rule, these restrictions do not get in the way of hobby projects, so the free version is sufficient in such cases. It can later be upgraded to a paid version without losing the data already stored.

As a first step, you must create a free user account on the InfluxDB Cloud 2.0 registration page. Click the confirmation link in the email.

After the user account is verified, log in and select a cloud provider. In Europe, InfluxDB Cloud 2.0 currently only runs on Amazon Web Services (AWS), but this does not prevent you from using the free version. If you are already a user of Amazon Web Services or Google Cloud Platform (GCP), you can subscribe to InfluxDB Cloud products through the marketplaces of these cloud providers.

Once you have logged in, InfluxDB shows your personal dashboard, where you can collect and view your data. You can collect data using Telegraf plugins, the InfluxDB v2 API, the influx command line interface (CLI), or directly in the InfluxDB user interface. Client libraries are also available for several popular programming languages.
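
As a rough sketch of the v2 API route, the following Python code prepares a write request using only the standard library. It assumes the documented /api/v2/write endpoint with the org, bucket and precision query parameters and token authentication; the base URL, organization, bucket and token shown are placeholders, not real values, and the helper build_write_request is a hypothetical name. The request is only built, not sent, so the example runs without network access.

```python
from __future__ import annotations

import urllib.parse
import urllib.request


def build_write_request(
    base_url: str, org: str, bucket: str, token: str, lines: list[str]
) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the InfluxDB v2 write endpoint."""
    query = urllib.parse.urlencode({"org": org, "bucket": bucket, "precision": "ns"})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v2/write?{query}",
        # The body is plain text: one line-protocol record per line.
        data="\n".join(lines).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
        method="POST",
    )


if __name__ == "__main__":
    # All values below are placeholders; substitute those of your own account.
    request = build_write_request(
        base_url="https://influx.example.invalid",
        org="my-org",
        bucket="sensors",
        token="MY_API_TOKEN",
        lines=["readings,sensor=s1 value=140.5 1587636000000000000"],
    )
    print(request.get_method(), request.full_url)
```

To actually send the data, the prepared request would be passed to urllib.request.urlopen together with a real URL and token.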

Figure: On the InfluxDB Cloud 2.0 home screen, you load data into the project and create your own dashboards (source: https://cloud2.influxdata.com/).

You can create Telegraf configurations interactively or use existing ones to send data to the InfluxDB Cloud 2.0 instance. Once InfluxDB Cloud is configured to collect data, you can create your own dashboards to query and visualize it.

In the InfluxDB data explorer, you can explore and visualize the collected data and adjust the refresh intervals and displayed time ranges to your liking. The InfluxDB user interface offers a number of attractive visualization options. In the web interface you can easily switch between the Flux query builder and manual editing of database queries.

In addition, under "Usage" you can check your database usage at any time to estimate whether a paid version would be worthwhile for you.

An overview of the most important news in InfluxDB Cloud 2.0

Free version (with limitations): you do not need downloads, installations or your own in-house server infrastructure. It offers a direct introduction to InfluxDB 2.0 technology and has been designed to familiarize you with InfluxDB and to carry out small hobby projects.

Flux compatibility: Flux is a query and scripting language for time series databases that increases productivity through easy code reuse. Flux has been developed and optimized to work with data in InfluxDB 2.0, but it can also be used in combination with other data sources.

Uniform API: the uniform InfluxDB v2 API gives access to all InfluxDB components, such as data collection, query, storage and visualization. This way, you can switch from the installed open source version to InfluxDB Cloud 2.0 without any problem.

Visualization and dashboards: based on the trusted Chronograf project from the first version of InfluxDB, the new cloud user interface lets you query and view results much faster and in real time.

Pay-as-you-go versions: usage-based billing offers more flexibility than a self-hosted database system and guarantees that you only pay for what you actually use.
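
The point about a single v2 API for both variants can be sketched as follows: the same Python helper prepares a Flux query request for a local installation and for a cloud instance, and only the base URL differs. The sketch assumes the documented /api/v2/query endpoint, which accepts a Flux script with the content type application/vnd.flux. The cloud URL, organization, token and the helper build_query_request are placeholders or hypothetical names; port 8086 is InfluxDB's default local port. Nothing is sent over the network.

```python
from __future__ import annotations

import urllib.parse
import urllib.request


def build_query_request(
    base_url: str, org: str, token: str, flux: str
) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the InfluxDB v2 query endpoint."""
    query = urllib.parse.urlencode({"org": org})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v2/query?{query}",
        data=flux.encode("utf-8"),  # the Flux script is sent as the request body
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/vnd.flux",
            "Accept": "application/csv",
        },
        method="POST",
    )


if __name__ == "__main__":
    flux = 'from(bucket: "sensors") |> range(start: -1h)'
    # Placeholder endpoints: a local install and a hypothetical cloud URL.
    for base_url in ("http://localhost:8086", "https://cloud.example.invalid"):
        request = build_query_request(base_url, "my-org", "MY_API_TOKEN", flux)
        print(request.get_method(), request.full_url)
```

Because path, parameters and headers stay the same, moving an application from the self-hosted version to the cloud service mainly means changing the base URL and the credentials.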


...