SUMMARY
This error occurs when a managing system cannot communicate with one of its sources or managed systems.
ISSUE
This issue appears in two forms:
When you click a system in the navigation pane, an error message pops up: "Database on system with hostname ______ is not responding".
Alternatively, in the navigation pane of the legacy UI, you may see an '!' next to the system. Hovering over the system displays a tooltip stating "This system was incompletely loaded".
RESOLUTION
- From the command line, ping the source system to verify that the target is able to communicate with the source.
- Verify that the source system is marked as active ("is_active") in the target's database. The command below sets is_active to true for every row in bp.systems, so it corrects the entry if the source is listed as inactive.
psql -U postgres bpdb -c "update bp.systems set is_active='t'"
- Verify that you can communicate with the source system database, and that the source system has the target system listed as a manager. (Replace HOSTNAME in the sample below with the actual hostname of the managed system.)
psql -U postgres bpdb -h HOSTNAME -c "select * from bp.managers"
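The ping and remote query checks above can be combined into one helper. This is only a sketch: the function name check_source is hypothetical, and it assumes a ping that accepts -c (as on Linux) and that psql is on the PATH of the managing system.

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): runs the ping check and the
# remote bp.managers query from the steps above against one managed system.
check_source() {
    host=$1

    # Step 1: basic network reachability. "-c 3" assumes a Linux-style ping.
    if ! ping -c 3 "$host" >/dev/null 2>&1; then
        echo "ping failed: $host"
        return 1
    fi

    # Step 3: query the managed system's database for its list of managers.
    # psql exits non-zero if the connection or the query fails.
    if ! psql -U postgres bpdb -h "$host" -c "select * from bp.managers"; then
        echo "database query failed: $host"
        return 2
    fi

    echo "ok: $host"
}
```

After sourcing the file, run it as check_source HOSTNAME. A return code of 1 points to a network problem, while 2 points to the database connection or its configuration files.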
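If the connection to the source system database fails, check the pg_hba.conf and pg_service.conf files. Only the last part of pg_hba.conf is configurable, and it should look like this: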
    # TYPE  DATABASE  USER            ADDRESS          METHOD
    # "local" is for Unix domain socket connections only
    local   all       all                              trust
    # IPv4 local connections:
    host    all       all             127.0.0.1/32     trust
    # IPv6 local connections:
    host    all       all             ::1/128          trust
    # Allow replication connections from localhost, by a user with the
    # replication privilege.
    host    bpdb      +bpexch,wguest  0.0.0.0/0        md5
    hostssl bpdb      postgres        172.17.3.1/32    trust
These lines are required at minimum but there may also be additional entries.
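To confirm that the required entries are present, a small script can compare the file against the list above. This is a sketch under assumptions: the function name check_hba is hypothetical, and the path to pg_hba.conf is passed as an argument because its location is not stated here.

```shell
#!/bin/sh
# Hypothetical helper: print each required pg_hba.conf entry that is missing.
# Returns 0 when all required entries are present, 1 otherwise.
check_hba() {
    hba_file=$1
    missing=0

    # Each required entry is listed below with single spaces between fields.
    while IFS= read -r required; do
        # Collapse runs of whitespace in the file and trim each line, then
        # look for an exact whole-line match of the required entry.
        if ! sed 's/[[:space:]]\{1,\}/ /g; s/^ //; s/ $//' "$hba_file" \
            | grep -qxF "$required"; then
            echo "missing: $required"
            missing=1
        fi
    done <<'EOF'
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
host bpdb +bpexch,wguest 0.0.0.0/0 md5
hostssl bpdb postgres 172.17.3.1/32 trust
EOF

    return "$missing"
}
```

Call it as check_hba /path/to/pg_hba.conf. Additional entries in the file are ignored, since only the minimum set is checked.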
pg_service.conf should appear as follows:
    [localhost]
    user=postgres
    connect_timeout=5

    [connpooldb]
    user=postgres
    dbname=pgbouncer
    host=localhost
    port=6432
    connect_timeout=5

    [upsilon]
    user=postgres
    host=localhost
    port=6432
    sslmode=prefer
    connect_timeout=30

    [HOSTNAME]
    user=postgres
    host=HOSTNAME
    connect_timeout=3
    sslmode=prefer
Here, HOSTNAME is again replaced by the hostname of the managed system. Make any needed corrections to these files, restart the database, and then repeat the second step above (the is_active update) to resolve the issue.
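Before restarting the database, it may help to confirm that pg_service.conf really contains a section for the managed system. The sketch below assumes the file uses unindented lines with no spaces around "=", as in the sample above. The function name check_service_entry is hypothetical, and the file path is passed in because its location is not stated here.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if pg_service.conf has a [HOSTNAME]
# section that also contains a matching host=HOSTNAME line.
check_service_entry() {
    service_file=$1
    host=$2

    awk -v host="$host" '
        # A section header: remember whether it is the one we are looking for.
        /^\[.*\]$/ {
            in_section = ($0 == "[" host "]")
            if (in_section) found_section = 1
            next
        }
        # Inside the wanted section, look for the matching host= setting.
        in_section && $0 == "host=" host { found_host = 1 }
        # Exit status 0 only when both the section and its host line exist.
        END { exit !(found_section && found_host) }
    ' "$service_file"
}
```

For example, check_service_entry /path/to/pg_service.conf HOSTNAME returns 0 when the entry exists and 1 when the section or its host= line is missing.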
CAUSE
This error occurs when communication is interrupted between the two systems, often due to network outages.
Damage to the pg_hba.conf and pg_service.conf files is usually caused by an improper dump and reload of the database.