cancel
Showing results for 
Search instead for 
Did you mean: 

Error in PEM worker on PEM server

SOLVED
Level 3 Adventurer

Re: Error in PEM worker on PEM server

Hi,

I'm now able to dump pem.alert_status. I will upload to you share drive.

 

Kind regards,

Paul-Emmanuel Devadder

 

NB: I restart pemagent and I have still the same error about alert_id=6 missing

EDB Team Member

Re: Error in PEM worker on PEM server

Hi Paul,

 

Thank you for your response. However, we kindly request you to reply to the PostgresRocks request via PostgresRocks portal.

As when you reply to the notification email from your inbox that creates a new thread on the PostgresRocks and this might create confusion due to missing previous information of previous thread.

 

I am currently copying your response on the previous thread for future reference. 

 

Warm Regards,

 

EDB Team Member

Re: Error in PEM worker on PEM server

Hi Paul,

 

Copying your response sent via a different thread.

 

--Quote--

Hi,

I'm now able to dump pem.alert_status. I will upload to you share drive.

 

Kind regards,

Paul-Emmanuel Devadder

 

NB: I restart pemagent and I have still the same error about alert_id=6 missing

--UnQuote--

 

We are waiting for a reply on the dump file upload and the output of previously shared procedure which will tell the corrupted ctid's.

 

Warm Regards.

Level 3 Adventurer

Re: Error in PEM worker on PEM server

Hi,

After execution of the function provided, 3 records was in the table:

pem=# select * from edb_corrupted_rows ;
schemaname |  tablename   |  t_ctid  | sqlstate |                             sqlerrm
------------+--------------+----------+----------+------------------------------------------------------------------
pem        | alert_status | (86,10)  | XX000    | missing chunk number 0 for toast value 1460192 in pg_toast_19233
pem        | alert_status | (312,28) | XX000    | missing chunk number 0 for toast value 1460190 in pg_toast_19233
pem        | alert_status | (345,12) | XX000    | missing chunk number 0 for toast value 1460162 in pg_toast_19233
(3 rows)
 
pem=# delete from pem.alert_status where ctid in ( select t_ctid from edb_corrupted_rows );
DELETE 3
pem=# \q

So that's why I was able to dump alert_status

 

Kind regards,

Paul-Emmanuel Devadder

EDB Team Member

Re: Error in PEM worker on PEM server

Hi Paul,

 

Thank you for your update. It is good to hear that you are able to successfully delete the corrupted rows from the database and successfully able to dump the table in a flat file.

 

Warm Regards,

Dhananjay

Level 3 Adventurer

Re: Error in PEM worker on PEM server

Hi,

Error in Worker log still present "alert_id=6 not present in alert"

"Update on table "alert_status" violates foreign key constraint "alert_status_alert_id_fkey"

 

Kind regards,

Paul-Emmanuel

EDB Team Member

Re: Error in PEM worker on PEM server

Hi Paul,

 

Can you please share the output of below commands. This will help us to identify the issue in more details.

 

\d pem.alert_status

\d pem.alert

 

Select * from pem.alert_status where alert_id='6';

select * from pem.alert where id='6';

 

Note: All above commands need to be executed on the pem database.

 

Regards.

Level 3 Adventurer

Re: Error in PEM worker on PEM server

Hi,

Find here the output of requested queries:

Table "pem.alert_status"
       Column        |           Type           | Collation | Nullable | Default

---------------------+--------------------------+-----------+----------+---------
alert_id            | integer                  |           | not null |

 current_value       | numeric                  |           |          |

 current_state       | pem.alert_state          |           |          |

 current_state_since | timestamp with time zone |           |          |

 last_processed      | timestamp with time zone |           | not null |

 state_change_count  | integer                  |           | not null | 0
info                | text                     |           |          |

 info_cols           | text[]                   |           |          |

 info_vals           | text[]                   |           |          |

 display_value       | text                     |           |          |

Indexes:
    "alert_status_alert_id_key" UNIQUE CONSTRAINT, btree (alert_id)
Foreign-key constraints:
    "alert_status_alert_id_fkey" FOREIGN KEY (alert_id) REFERENCES pem.alert(id) ON UPDATE RESTRICT ON DELETE CASCADE
Triggers:
    detail_alert_information BEFORE INSERT OR UPDATE ON pem.alert_status FOR EACH ROW EXECUTE PROCEDURE pem.get_detail_alert_info()
    log_alert_history BEFORE INSERT OR UPDATE ON pem.alert_status FOR EACH ROW EXECUTE PROCEDURE pem.log_alert_history()
    send_notifications_trigger AFTER INSERT OR UPDATE ON pem.alert_status FOR EACH ROW EXECUTE PROCEDURE pem.send_notifications()
 
                                                      Table "pem.alert"
              Column               |           Type           | Collation | Nullable |                Default               

-----------------------------------+--------------------------+-----------+----------+---------------------------------------
id                                | integer                  |           | not null | nextval('pem.alert_id_seq'::regclass)
name                              | text                     |           |          |

 enabled                           | boolean                  |           | not null | true
template_id                       | integer                  |           |          |

 agent_id                          | integer                  |           | not null |

 server_id                         | integer                  |           |          |

 database_name                     | text                     |           |          |

 schema_name                       | text                     |           |          |

 package_name                      | text                     |           |          |

 object_name                       | text                     |           |          |

 params                            | text[]                   |           |          |

 operator                          | "char"                   |           | not null |

 thresholds                        | numeric[]                |           | not null |

 check_frequency                   | integer                  |           | not null | 1
history_retention                 | integer                  |           | not null | 30
error_message                     | text                     |           |          |

 acknowledged                      | boolean                  |           | not null | false
email_group_id                    | integer                  |           |          |

 send_email                        | boolean                  |           | not null | false
last_mail_send                    | timestamp with time zone |           |          |

 flapping_detected                 | boolean                  |           | not null | false
last_flapping_detection_processed | timestamp with time zone |           | not null | current_timestamp
send_trap                         | boolean                  |           | not null | false
snmp_trap_version                 | integer                  |           | not null | 2
error_timestamp                   | timestamp with time zone |           |          |

 low_send_trap                     | boolean                  |           | not null | false
low_email_group_id                | integer                  |           |          |

 med_send_trap                     | boolean                  |           | not null | false
med_email_group_id                | integer                  |           |          |

 high_send_trap                    | boolean                  |           | not null | false
high_email_group_id               | integer                  |           |          |

 execute_script                    | boolean                  |           | not null | false
execute_script_on_clear           | boolean                  |           | not null | false
execute_script_on_pem_server      | boolean                  |           | not null | false
script_code                       | text                     |           |          |

 submit_to_nagios                  | boolean                  |           | not null | false
auto_created                      | boolean                  |           | not null | false
Indexes:
    "alert_pkey" PRIMARY KEY, btree (id)
    "alert_name_object_type_uniq" UNIQUE, btree (COALESCE(name, 'dummy_name'::text), agent_id, COALESCE(server_id, 0), COALESCE(database_name, 'dummy_database_name'::text), COALESCE(schema_name,
 'dummy_schema_name'::text), COALESCE(package_name, 'dummy_package_name'::text), COALESCE(object_name, 'dummy_object_name'::text))
Check constraints:
    "check_alert_params_array_size" CHECK (pem.check_alert_params_array_size(template_id, params))
    "check_frequency_gt_zero" CHECK (check_frequency > 0)
    "history_gt_zero" CHECK (history_retention > 0)
    "valid_operators" CHECK (operator = ANY (ARRAY['<'::"char", '>'::"char"]))
    "validate_thresholds_ordering" CHECK (array_upper(thresholds, 1) = 3 AND thresholds[1] IS NOT NULL AND thresholds[2] IS NOT NULL AND thresholds[3] IS NOT NULL AND
CASE
    WHEN operator = '<'::"char" THEN thresholds[1] > thresholds[2] AND thresholds[2] > thresholds[3]
    WHEN operator = '>'::"char" THEN thresholds[1] < thresholds[2] AND thresholds[2] < thresholds[3]
    ELSE false
END)
Foreign-key constraints:
    "alert_email_group_id_fkey" FOREIGN KEY (email_group_id) REFERENCES pem.email_group(id) ON UPDATE RESTRICT ON DELETE SET DEFAULT
    "alert_high_email_group_id_fkey" FOREIGN KEY (high_email_group_id) REFERENCES pem.email_group(id) ON UPDATE RESTRICT ON DELETE SET DEFAULT
    "alert_low_email_group_id_fkey" FOREIGN KEY (low_email_group_id) REFERENCES pem.email_group(id) ON UPDATE RESTRICT ON DELETE SET DEFAULT
    "alert_med_email_group_id_fkey" FOREIGN KEY (med_email_group_id) REFERENCES pem.email_group(id) ON UPDATE RESTRICT ON DELETE SET DEFAULT
    "alert_template_id_fkey" FOREIGN KEY (template_id) REFERENCES pem.alert_template(id) ON UPDATE RESTRICT ON DELETE CASCADE
Referenced by:
    TABLE "pem.alert_history" CONSTRAINT "alert_history_alert_id_fkey" FOREIGN KEY (alert_id) REFERENCES pem.alert(id) ON UPDATE RESTRICT ON DELETE CASCADE
    TABLE "pem.alert_status" CONSTRAINT "alert_status_alert_id_fkey" FOREIGN KEY (alert_id) REFERENCES pem.alert(id) ON UPDATE RESTRICT ON DELETE CASCADE
Triggers:
    alert_error_timestamp BEFORE UPDATE ON pem.alert FOR EACH ROW WHEN (old.error_message IS DISTINCT FROM new.error_message) EXECUTE PROCEDURE pem.log_alert_error_timestamp()
 
alert_id | current_value | current_state | current_state_since | last_processed | state_change_count | info | info_cols | info_vals | display_value

----------+---------------+---------------+---------------------+----------------+--------------------+------+-----------+-----------+---------------
(0 rows)
 
id | name | enabled | template_id | agent_id | server_id | database_name | schema_name | package_name | object_name | params | operator | thresholds | check_frequency | history_retention
 | error_message | acknowledged | email_group_id | send_email | last_mail_send | flapping_detected | last_flapping_detection_processed | send_trap | snmp_trap_version | error_timestamp | low_send_trap | low_email_group_id | med_send_trap | med_email_group_id
 | high_send_trap | high_email_group_id | execute_script | execute_script_on_clear | execute_script_on_pem_server | script_code | submit_to_nagios | auto_created

----+------+---------+-------------+----------+-----------+---------------+-------------+--------------+-------------+--------+----------+------------+-----------------+-------------------+---------------+--------------+----------------+------------+----------------+-------------------+-----------------------------------+-----------+-------------------+-----------------+---------------+--------------------+---------------+--------------------+----------------+---------------------+----------------+-------------------------+------------------------------+-------------+------------------+--------------

Kind regards,

Paul-Emmanuel

EDB Team Member

Re: Error in PEM worker on PEM server

Hi Paul,

 

Thanks for the update. From the earlier communication, you have removed the corrupted rows and you are able to take dump of alert_status table.

 

Just to verify, if you have any other corruption present in the DB, could you please first perform the pg_dump of entire database cluster.

 

If there are no issues in the pg_dump and you still receive error in pemworker.log for alert id 6, could you please try restarting the PEM services in the below order :

 

1) PEM HTTPD

2) PEM Backend DB

3) PEM agent

 

As output shared by you, there is no entry for id 6, in both pem.alert and pem.alert_status, we want to verify if it still exists after the restart of the services.

 

Also, could you please let us know whether you have any backup of this PEM database cluster before the issue, as you have removed the corrupted rows from the cluster and to verify which rows from the table were removed, you can restore that backup and add it in current cluster.

 

Please let us know in case of any further issues.

 

Regards,

Sudhir Lonkar

Level 3 Adventurer

Re: Error in PEM worker on PEM server

Hi,

No corruption in pg_dump. I restart the 3 services in the order requested. There is still errors in worker.log but id have change since this morning 1776 that is not in pem.alert.

 

I 'm wondering if we can cleanning or purging all alerts in one command. I don't need those old alerts.

 

Kind regards,

Paul-Emmanuel