Monitoring Service with Zabbix - Installing from Source Code

Yesterday I finished installing Zabbix, and today I will try using the Zabbix agent to monitor a new server. (English version translated by GPT-3.5.)

Copying Zabbix Agent, No, Compiling Arm Version of Zabbix Agent

First, go back to the Zabbix server and navigate to the installation directory of Zabbix.

/usr/local/zabbix

The structure of this directory is as follows:

tree of /usr/local/zabbix
├── bin
│   ├── zabbix_get
│   ├── zabbix_js
│   └── zabbix_sender
├── etc
│   ├── zabbix_agentd.conf
│   ├── zabbix_agentd.conf.d
│   ├── zabbix_server.conf
│   └── zabbix_server.conf.d
├── lib
│   └── modules
├── php-webui (1460 files hidden)
├── sbin
│   ├── zabbix_agentd
│   └── zabbix_server
└── share
    ├── man
    │   ├── man1
    │   │   ├── zabbix_get.1
    │   │   └── zabbix_sender.1
    │   └── man8
    │       ├── zabbix_agentd.8
    │       └── zabbix_server.8
    └── zabbix
        ├── alertscripts
        └── externalscripts

The monitored machine only needs zabbix_agentd, so the plan was to take sbin/zabbix_agentd and etc/zabbix_agentd.conf from here and copy them to the target host, the Oracle machine I originally wanted to monitor. That does not work directly, though: the target is an Arm-based server, while this zabbix_agentd binary was built for x86. So we need to compile a separate zabbix_agentd on the Arm host.

I will skip the details of the build process; please refer to the previous article. Compilation took 12 seconds.

The configure command is as follows:

./configure --enable-agent --enable-ipv6 --with-net-snmp --with-libcurl --with-libxml2 --with-openssl --prefix=/usr/local/zabbix

If a configure error occurs along the way, such as the one I hit in the last step:

config.status: error: Something went wrong bootstrapping makefile fragments
for automatic dependency tracking. If GNU make was not used, consider
re-running the configure script with MAKE="gmake" (or whatever is
necessary). You can also try re-running configure with the
'--disable-dependency-tracking' option to at least be able to build
the package (albeit without support for automatic dependency tracking).
See `config.log' for more details

Check the contents of config.log for the actual cause. In my case, configure failed because make was not installed.
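
To catch this kind of problem before running configure, a quick check of the build tools can help. The following is only a minimal sketch: the helper name missing_tools is my own, and the tool list you pass in (for example gcc and make) is an assumption that depends on your distribution.

```shell
#!/bin/sh
# Print the name of every required command that is not found on PATH.
# Hypothetical helper; the list of tools to check is up to the caller.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example: check for a compiler and make before running ./configure.
# missing_tools gcc make
```

If the function prints nothing, all listed tools were found. Anything it prints needs to be installed first.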

Creating a New Entry in Zabbix Server

In the Zabbix web interface, go to “Monitoring - Hosts” on the left side, and click “Create Host” on the top right corner.
Add Host
Here, I added a host named “Oracle ARM”.

Add a Host Entry

Then switch to the “Encryption” tab. For “Connections to host”, choose “Pre-shared Key (PSK)”; for “Connections from host”, uncheck all options except “Pre-shared Key (PSK)”.

Note the following warning in the Zabbix documentation (“2 Using Pre-Shared Keys”, zabbix.com):

Do not put sensitive information in the PSK identity string – it is transmitted over the network unencrypted.

My understanding is that the “PSK identity” is a name anyone may learn, while the “Pre-shared Key (PSK)” is the secret that only I know.

PSK Configuration

The PSK can be generated with OpenSSL in a single command. The documentation also describes how to generate one with GnuTLS, but I used OpenSSL here.

openssl rand -hex 32

This produces a key like the following:

c59a90b393b24f640055682882f67c8e1479dbf598e66f544fd5afa692cd820e

Go back to the web interface, enter the value above into the PSK field, and save. After that, the entry should look like this:

List of Hosts
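
Before pasting a key into the web interface, it can be sanity-checked locally. The sketch below assumes the limits I remember from the Zabbix documentation (hexadecimal digits only, 32 to 512 of them, which is 128 to 2048 bits); the function name is_valid_psk is hypothetical and the check does not replace the server's own validation.

```shell
#!/bin/sh
# Succeed if the argument looks like a usable PSK: hexadecimal digits only,
# an even number of them, and between 32 and 512 digits in total.
# The limits are an assumption based on the Zabbix documentation.
is_valid_psk() {
    key=$1
    case $key in
        ''|*[!0-9a-fA-F]*) return 1 ;;   # empty, or contains a non-hex character
    esac
    len=${#key}
    [ "$len" -ge 32 ] && [ "$len" -le 512 ] && [ $((len % 2)) -eq 0 ]
}

# Example:
# is_valid_psk "$(openssl rand -hex 32)" && echo "key looks fine"
```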

Configuring Zabbix_agentd in Active Mode

The server is already configured, so now let’s configure the client. Go to the server that needs to be monitored, the Arm-based server.

Writing zabbix_agentd.psk

Create a file named /usr/local/zabbix/etc/zabbix_agentd.psk and write the content below into the file.

echo 'c59a90b393b24f640055682882f67c8e1479dbf598e66f544fd5afa692cd820e' > /usr/local/zabbix/etc/zabbix_agentd.psk
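
A plain echo creates the file with the default permissions, which usually lets every local user read the key. As a hedged alternative, the sketch below writes the key so that only the file's owner can read it. The helper name write_psk_file is my own, and it assumes the file is created by the same user that runs the agent; if the agent runs as a different user, the file has to be handed over to that user with chown.

```shell
#!/bin/sh
# Write a PSK to a file that only its owner can read or write.
# usage: write_psk_file <path> <hex-key>
write_psk_file() {
    path=$1
    key=$2
    (
        umask 077                       # a newly created file gets mode 0600
        printf '%s\n' "$key" > "$path"
    )
    chmod 600 "$path"                   # also tighten a file that already existed
}

# Example:
# write_psk_file /usr/local/zabbix/etc/zabbix_agentd.psk "$(openssl rand -hex 32)"
```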

Configuring conf

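First, set the basic options in zabbix_agentd.conf.

# Set Server to 0 to turn off checks against local port 10050, so that the server only receives data passively. If a server address is configured here, the server polls this agent itself.
Server=0
# Setting this to 0 disables the agent's passive checks (when the server is the active side, the agent is passive; when the server passively receives data, the agent is the active side).
StartAgents=0
# ServerActive is the IP address of the Zabbix server; the agent connects to port 10051 on it by default.
ServerActive=121.41.65.201
Hostname=Oracle ARM
# Set this to 1 to run as root. The default is 0, which runs the agent as the zabbix user.
# For testing, I set it to 1 here to allow running as root.
AllowRoot=1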

Then configure the encryption options, following the documentation. TLSPSKIdentity must match the PSK identity entered in the web interface.

TLSConnect=psk
TLSAccept=psk
TLSPSKFile=/usr/local/zabbix/etc/zabbix_agentd.psk
TLSPSKIdentity=oracle key
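
The stock zabbix_agentd.conf is mostly comments, so it is easy to lose track of what is actually set. A small sketch for listing only the active settings follows; the function name effective_settings is hypothetical, and it simply filters the file with grep rather than asking the agent itself.

```shell
#!/bin/sh
# Print only the active settings of a config file: skip blank lines and
# lines whose first non-blank character is '#'.
effective_settings() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Example:
# effective_settings /usr/local/zabbix/etc/zabbix_agentd.conf
```

With the settings from this section, the output should contain lines such as ServerActive, Hostname, and the four TLS options, which makes typos easy to spot.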

Starting zabbix_agentd

The last step is to start the service.

sbin/zabbix_agentd

The agent writes its log to /tmp/zabbix_agentd.log. Because the network to this overseas server can be poor, the first attempts may fail. Once you see “Active check configuration update from [xxx.xxx.xxx.xxx] is working again”, the agent has reached the server and data is being sent.

92311:20220520:135116.883 Starting Zabbix Agent [Oracle ARM]. Zabbix 6.0.4 (revision 3d787ff402e).
92311:20220520:135116.883 **** Enabled features ****
92311:20220520:135116.883 IPv6 support: YES
92311:20220520:135116.883 TLS support: YES
92311:20220520:135116.883 **************************
.......
92316:20220520:135119.887 Unable to connect to [xxx.xxx.xxx.xxx]:10051 [TCP successful, cannot establish TLS to [[121.41.65.201]:10051]: SSL_connect() timed out]
92316:20220520:135119.887 Active check configuration update started to fail
92316:20220520:135220.618 Active check configuration update from [xxx.xxx.xxx.xxx] is working again

If you see the following error, the Zabbix server port is not reachable. Check the firewall, or check whether the Zabbix server is listening only on 127.0.0.1 (the firewall is the usual culprit, as it was in my case).

89976:20220520:073726.816 agent #5 started [active checks #1]
89976:20220520:073729.818 Unable to connect to [xxx.xxx.xxx.xxx]:10051 [cannot connect to [[xxx.xxx.xxx.xxx]:10051]: [4] Interrupted system call]
89976:20220520:073729.818 Active check configuration update started to fail

If you encounter the following error, check if the PSK key configured on the web page matches the content in /usr/local/zabbix/etc/zabbix_agentd.psk.

92248:20220520:134652.850 Unable to connect to [xxx.xxx.xxx.xxx]:10051 [TCP successful, cannot establish TLS to [[xxx.xxx.xxx.xxx]:10051]: SSL_connect() set result code to SSL_ERROR_SSL: file ../ssl/record/rec_layer_s3.c line 1543: error:14094417:SSL routines:ssl3_read_bytes:sslv3 alert illegal parameter: SSL alert number 47: TLS read fatal alert "illegal parameter"]
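
The three cases above can be checked in one pass over the log. This is only a rough sketch: the function name diagnose_agent_log is my own, and the patterns are just the message fragments quoted in this post, so other Zabbix or OpenSSL versions may word them differently.

```shell
#!/bin/sh
# Scan an agent log for the messages quoted above and print the matching hint.
# The patterns are taken from the log excerpts in this post only.
diagnose_agent_log() {
    log=$1
    if grep -q 'Interrupted system call' "$log"; then
        echo 'port unreachable: check the firewall and the server listen address'
    fi
    if grep -q 'illegal parameter' "$log"; then
        echo 'TLS alert: compare the PSK on the web page with the agent PSK file'
    fi
    if grep -q 'is working again' "$log"; then
        echo 'active checks recovered: the agent reached the server'
    fi
}

# Example:
# diagnose_agent_log /tmp/zabbix_agentd.log
```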

The final result is as follows:

Monitoring Data

Appendix: Configuring Zabbix_agentd in Passive Mode

Next, I will set up the agent in passive mode on another server, an AMD one (in passive mode, the Zabbix server actively fetches data from the agent).

First, configure a new host

I won’t repeat the description here. The configuration is the same as before. For the template, I chose “Linux by Zabbix agent”. This template provides default monitoring items prepared by Zabbix. I will customize the monitoring items later.

Configure New Host

Configure Key

After configuration, a new entry appears in the list, marked with a ZBX icon.

ZBX Mark

The following is quoted from the Zabbix documentation (“2 Creating a Host”, zabbix.com):

  • ZBX (gray) indicates that the host status has not been established and no item checks have been made yet
  • ZBX (green) indicates that the host is available and item checks have been successful
  • ZBX (red) indicates that the host is unavailable and item checks have failed (move the mouse cursor over the icon to view the error message). This may be due to communication issues caused by incorrect interface credentials. Check if the Zabbix server is running and try refreshing the page later.

Next, copy sbin/zabbix_agentd and etc/zabbix_agentd.conf from the Zabbix server to the server being monitored. (The first server was Arm-based, so the agent had to be compiled there. The AMD server is x86, so the existing binary can be reused without compiling.) First, create two directories:

mkdir -p /usr/local/zabbix-agent/sbin
mkdir -p /usr/local/zabbix-agent/etc

Then copy sbin/zabbix_agentd to /usr/local/zabbix-agent/sbin and etc/zabbix_agentd.conf to /usr/local/zabbix-agent/etc. Before editing it, don’t forget to back up zabbix_agentd.conf. The directory now looks like this:

/usr/local/zabbix-agent
├── etc
│   └── zabbix_agentd.conf
└── sbin
    └── zabbix_agentd
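
The directory creation, the copy, and the backup can be bundled into one step. The sketch below assumes the two files from the Zabbix server are already available under some local source directory (for example after transferring them with scp); the function name stage_agent and the .bak suffix are my own choices, not anything Zabbix prescribes.

```shell
#!/bin/sh
# Copy the agent binary and its config from a source prefix into a new prefix,
# and keep an untouched copy of the config next to it.
# usage: stage_agent <source-prefix> <destination-prefix>
stage_agent() {
    src=$1
    dst=$2
    mkdir -p "$dst/sbin" "$dst/etc"
    cp "$src/sbin/zabbix_agentd" "$dst/sbin/"
    cp "$src/etc/zabbix_agentd.conf" "$dst/etc/"
    # Backup of the original config; the .bak name is only a convention.
    cp "$dst/etc/zabbix_agentd.conf" "$dst/etc/zabbix_agentd.conf.bak"
}

# Example (the source path is hypothetical):
# stage_agent /root/zabbix-from-server /usr/local/zabbix-agent
```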

Configure zabbix_agentd.conf

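# 1. Enter the IP of the Zabbix server here; only this IP is allowed to connect
Server=your.zabbix.server.ip.address
# 2. zabbix_agentd listens on port 10050 by default; if you use a different port or NAT is involved, change this to a usable port
ListenPort=10050

# 3. Comment this out; doing so disables the agent's active (push) mode
#ServerActive=127.0.0.1

# 4. Set this to the host name entered in the Zabbix web interface
Hostname=Oracle AMD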

Configure the PSK Key

First, write the key that was entered on the web page into /usr/local/zabbix-agent/etc/zabbix_agentd.psk, then configure the agent as follows:

TLSConnect=psk
TLSAccept=psk
TLSPSKFile=/usr/local/zabbix-agent/etc/zabbix_agentd.psk
TLSPSKIdentity=oracle key2

Starting Zabbix Agent

This time the start command is different. The binary was copied from the Zabbix server, so its default configuration path is /usr/local/zabbix/etc/zabbix_agentd.conf rather than the new zabbix-agent directory, and we must specify the configuration file explicitly.

sbin/zabbix_agentd -c etc/zabbix_agentd.conf

Then, use netstat -anp | grep 10050 to check if the port is listening.

tcp        0      0 0.0.0.0:10050           0.0.0.0:*               LISTEN      73453/sbin/zabbix_a
tcp6       0      0 :::10050                :::*                    LISTEN      73453/sbin/zabbix_a
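
If you want this check in a script instead of reading the output by eye, the netstat output can be parsed. The sketch below is an assumption-laden helper of my own (listening_on is not a Zabbix tool): it reads netstat-style lines on standard input and expects the local address in the fourth column and the state in the sixth, as in the output above.

```shell
#!/bin/sh
# Read `netstat -an` style output on stdin and succeed if some socket is in
# the LISTEN state on the given port.
# usage: netstat -ant | listening_on <port>
listening_on() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            # The port is whatever follows the last colon of the local address.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}

# Example:
# netstat -ant | listening_on 10050 && echo "agent is listening"
```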

Wait for a while, and you will see the data.

Data Received

Data Available

Finally, Running Zabbix_agentd as a Service

At this point, the Zabbix server and both Zabbix agents were started by hand and are simply running in the background; after a reboot, none of them will come back up. They therefore need to be registered as services. First, though, a few more configuration changes are needed. Create a “logs” directory under the zabbix-agent directory.

mkdir /usr/local/zabbix-agent/logs

Then, modify the following configuration:

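# 1. Change the PID file location. On Linux, /tmp is cleaned up periodically, so write the PID file to a directory that will not be cleaned.
PidFile=/usr/local/zabbix-agent/logs/zabbix_agentd.pid
# The log could actually stay in /tmp, but since we created logs, keep everything together.
LogFile=/usr/local/zabbix-agent/logs/zabbix_agentd.log
# Then disable AllowRoot for security (the agent then runs as the zabbix user by default, so we will create a zabbix user shortly).
AllowRoot=0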
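
These edits can also be applied from a script instead of by hand. The following is a minimal sketch with a helper name of my own (set_conf_option): it rewrites the first active KEY= line if one exists and otherwise appends a new line, leaving commented-out defaults untouched. It assumes simple values such as paths and numbers without backslashes.

```shell
#!/bin/sh
# Set KEY=VALUE in a Zabbix-style config file.
# If an active "KEY=" line exists, its first occurrence is rewritten;
# otherwise a new line is appended. Commented lines are left alone.
# usage: set_conf_option <file> <key> <value>
set_conf_option() {
    file=$1
    key=$2
    value=$3
    if grep -q "^$key=" "$file"; then
        tmp=$(mktemp)
        awk -v k="$key" -v v="$value" '
            !seen && index($0, k "=") == 1 { print k "=" v; seen = 1; next }
            { print }
        ' "$file" > "$tmp" && cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example:
# set_conf_option /usr/local/zabbix-agent/etc/zabbix_agentd.conf AllowRoot 0
```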

Create a “zabbix” user. Here, -M means not to create a home directory.

groupadd zabbix
useradd -g zabbix -M -s /sbin/nologin zabbix

Change the ownership of the logs directory to the zabbix user.

chown -R zabbix:zabbix /usr/local/zabbix-agent/logs

Then, write the following content into /etc/systemd/system/zabbix-agent.service:

[Unit]
Description=Zabbix agent server
After=syslog.target network.target

[Service]
Type=simple
User=zabbix
Group=zabbix
PIDFile=/usr/local/zabbix-agent/logs/zabbix_agentd.pid
ExecStart=/usr/local/zabbix-agent/sbin/zabbix_agentd -c /usr/local/zabbix-agent/etc/zabbix_agentd.conf -f
PrivateTmp=true
# RestartSec only takes effect when Restart= is also set
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=multi-user.target

Finally, reload systemd so that it picks up the new unit file, then start the service.

systemctl daemon-reload
systemctl start zabbix-agent

Set it to start on boot.

systemctl enable zabbix-agent

That’s it. We’re done.