topic:linux_repare_nagios_db


Repair

Problem description

Several tables crashed in the MySQL database that the Nagios monitoring system writes its data to.
Nagios itself kept running, but the log (/var/log/messages) was flooded with messages like these:

2017-04-11T13:49:25.025157+03:00 msk-02-noc ndo2db: mysql_error: 'Table './nagios_db/nagios_hoststatus' is marked as crashed and should be repaired'

2017-04-11T13:49:25.025167+03:00 msk-02-noc ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_hoststatus SET instance_id='1', host_object_id='207', status_update_time=FROM_UNIXTIME(1491907765), output='PING OK - Packet loss = 0%, RTA = 61\.61 ms', long_output='', perfdata='rta=61\.612000ms;3000\.000000;5000\.000000;0\.000000 pl=0%;80;100;0', current_state='0', has_been_checked='1', should_be_scheduled='1', current_check_attempt='1', max_check_attempts='3', last_check=FROM_UNIXTIME(1491907746), next_check=FROM_UNIXTIME(1491907807), check_type='0', last_state_change=FROM_UNIXTIME(1491247748), last_hard_state_change=FROM_UNIXTIME(1489995197), last_hard_state='0', last_time_up=FROM_UNIXTIME(1491907764), last_time_down=FROM_UNIXTIME(1491247676), last_time_unreachable=FROM_UNIXTIME(1491237594), state_type='1', last_notification=FROM_UNIXTIME(0), next_notification=FROM_UNIXTIME(0), no_more_notifications='0', notifications_enabled='1', problem_has_been_acknowledged='0', acknowledgement_type='0', current_notification_number='0', passive_checks_enabled='1', active_checks_enabled='1', event_handler_enabled='1', flap_detection_enabled='0', is_flapping='0', percent_state_change='0.000000', latency='6.762000', execution_time='2.420050', scheduled_downtime_depth='0', failure_prediction_enabled='1', process_performance_data='1', obsess_over_host='1', modified_host_attributes='0', event_handler='', check_command='check-host-alive', normal_check_interval='1.000000', retry_check_interval='0.100000', check_timeperiod_object_id='2' ON DUPLICATE KEY UPDATE instance_id='1', host_object_id='207', status_update_time=FROM_UNIXTIME(1491907765), output='PING OK - Packet loss = 0%, RTA = 61\.61 ms', long_output='', perfdata='rta=61\.612000ms;3000\.000000;5000\.000000;0\.000000 pl=0%;80;100;0', current_state='0', has_been_checked='1', should_be_scheduled='1', current_check_attempt='1', max_check_attempts='3', last_check=FROM_UNIXTIME(1491907746), next_check=FROM_UNIXTIME(1491907807), 
check_type='0', last_state_change=FROM_UNIXTIME(1491247748), last_hard_state_change=FROM_UNIXTIME(1489995197), last_hard_state='0', last_time_up=FROM_UNIXTIME(1491907764), last_time_down=FROM_UNIXTIME(1491247676), last_time_unreachable=FROM_UNIXTIME(1491237594), state_type='1', last_notification=FROM_UNIXTIME(0), next_notification=FROM_UNIXTIME(0), no_more_notifications='0', notifications_enabled='1', problem_has_been_acknowledged='0', acknowledgement_type='0', current_notification_number='0', passive_checks_enabled='1', active_checks_enabled='1', event_handler_enabled='1', flap_detection_enabled='0', is_flapping='0', percent_state_change='0.000000', latency='6.762000', execution_time='2.420050', scheduled_downtime_depth='0', failure_prediction_enabled='1', process_performance_data='1', obsess_over_host='1', modified_host_attributes='0', event_handler='', check_command='check-host-alive', normal_check_interval='1.000000', retry_check_interval='0.100000', check_timeperiod_object_id='2''

Use grep and awk to find the crashed tables:

[root@msk-02-noc log]# grep  'crashed and should be repaired' mysqld.log | awk '{print $6}' | sort | uniq
'./nagios_db/nagios_hoststatus'
'./nagios_db/nagios_servicechecks'
'./nagios_db/nagios_systemcommands'
'./nagios_db/nagios_timedeventqueue'
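
The error text itself says the tables should be repaired, so a natural next step is to turn the list above into repair statements. The following is only a sketch under assumptions not stated in the original: that the tables use the MyISAM engine (which is what this "marked as crashed" message usually indicates), that ndo2db is stopped while the repair runs, and that a backup of the database files exists. The function name crashed_to_repair_sql is made up for illustration; it only prints SQL and does not touch the database.

```shell
#!/bin/sh
# Hypothetical helper (not from the original page): reads quoted table
# paths such as './nagios_db/nagios_hoststatus' on stdin, one per line,
# and prints one REPAIR TABLE statement per distinct table on stdout.
crashed_to_repair_sql() {
    tr -d "'" \
        | sed -e 's|^\./||' -e 's|/|.|' \
        | awk 'NF { print "REPAIR TABLE " $0 ";" }' \
        | sort -u
}

# Demo with the four tables found above; nothing is executed against MySQL.
crashed_to_repair_sql <<'EOF'
'./nagios_db/nagios_hoststatus'
'./nagios_db/nagios_servicechecks'
'./nagios_db/nagios_systemcommands'
'./nagios_db/nagios_timedeventqueue'
EOF
# Prints:
# REPAIR TABLE nagios_db.nagios_hoststatus;
# REPAIR TABLE nagios_db.nagios_servicechecks;
# REPAIR TABLE nagios_db.nagios_systemcommands;
# REPAIR TABLE nagios_db.nagios_timedeventqueue;
```

After reviewing the output, the statements could be piped into the mysql client (for example `... | crashed_to_repair_sql | mysql -u root -p`). Running `mysqlcheck --repair nagios_db nagios_hoststatus` for each table is an alternative way to issue the same repair.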