Thursday, July 6, 2017

Configuration Sync Failure Between a Cisco ASA Active and Standby Firewall

I ran into an issue where a Standby firewall was unable to sync its configuration with the Active firewall. On closer inspection, the Active firewall's running config had actually been replicated to the Standby firewall despite the error being displayed. When I consulted Cisco TAC, they found no matching software bug. A quick fix is to reboot the firewall pair to restart the synchronization process, starting with the Standby and then the Active.


ciscoasa/pri/stby# show version

Cisco Adaptive Security Appliance Software Version 9.1(3)
Device Manager Version 7.1(3)

ciscoasa/pri/stby#
    Unable to sync configuration from Active
.

    Detected an Active mate


ciscoasa/pri/stby# show failover
Failover On
Failover unit Primary
Failover LAN Interface: folink GigabitEthernet0/7 (up)
Unit Poll frequency 1 seconds, holdtime 15 seconds
Interface Poll frequency 5 seconds, holdtime 25 seconds
Interface Policy 1
Monitored Interfaces 6 of 216 maximum
Version: Ours 9.1(3), Mate 9.1(3)
Last Failover at: 14:39:20 UTC May 15 2017
    This host: Primary - Sync Config
        Active time: 0 (sec)
        slot 0: ASA5525 hw/sw rev (1.0/9.1(3)) status (Up Sys)

<SNIP>

    Other host: Secondary - Active
        Active time: 36400816 (sec)
        slot 0: ASA5525 hw/sw rev (1.0/9.1(3)) status (Up Sys)
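
As a side note, the Active time counter in this output is reported in seconds, which is hard to read at this size. A minimal Python sketch converts it to days; the helper name active_time is my own and not an ASA term.

```python
from datetime import timedelta

def active_time(seconds):
    """Convert the 'Active time' counter (seconds) from `show failover` into a timedelta."""
    return timedelta(seconds=seconds)

# Value taken from the "Other host: Secondary - Active" section above.
print(active_time(36400816))  # prints: 421 days, 7:20:16
```

So the Secondary unit had been holding the Active role for a little over 421 days at this point.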


ciscoasa/pri/stby# show failover state

               State          Last Failure Reason      Date/Time
This host  -   Primary
               Sync Config    None
Other host -   Secondary
               Active         None

====Configuration State===
====Communication State===


ciscoasa/pri/stby# show failover history
==========================================================================
From State                 To State                   Reason
==========================================================================
02:50:54 UTC May 19 2017
Negotiation                Cold Standby               Detected an Active mate

02:50:56 UTC May 19 2017
Cold Standby               Sync Config                Detected an Active mate

02:52:58 UTC May 19 2017
Sync Config                Negotiation                HA state progression failed

02:53:00 UTC May 19 2017
Negotiation                Cold Standby               Detected an Active mate

02:53:02 UTC May 19 2017
Cold Standby               Sync Config                Detected an Active mate

02:55:04 UTC May 19 2017
Sync Config                Negotiation                HA state progression failed
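
The history shows the unit looping: Negotiation, Cold Standby, Sync Config, then back to Negotiation roughly two minutes later with "HA state progression failed". To spot that loop in a saved capture, a rough Python sketch like the one below could count the failed sync attempts. It assumes the column layout shown above (columns separated by two or more spaces); the function names are hypothetical.

```python
import re

# A transition row has three columns separated by runs of two or more spaces.
# Timestamp lines use single spaces only, so they do not match.
ROW = re.compile(r"^(\S.*?)\s{2,}(\S.*?)\s{2,}(\S.*)$")

def parse_history(text):
    """Return (from_state, to_state, reason) tuples from `show failover history` text."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, the ===== rulers and the column header.
        if not line or line.startswith("=") or line.startswith("From State"):
            continue
        m = ROW.match(line)
        if m:
            rows.append(m.groups())
    return rows

def count_failed_syncs(rows):
    """Count transitions that drop out of Sync Config back to Negotiation."""
    return sum(1 for frm, to, _ in rows if frm == "Sync Config" and to == "Negotiation")

# Excerpt copied from the output above.
SAMPLE_HISTORY = """\
From State                 To State                   Reason
==========================================================================
02:50:54 UTC May 19 2017
Negotiation                Cold Standby               Detected an Active mate

02:50:56 UTC May 19 2017
Cold Standby               Sync Config                Detected an Active mate

02:52:58 UTC May 19 2017
Sync Config                Negotiation                HA state progression failed
"""

print(count_failed_syncs(parse_history(SAMPLE_HISTORY)))  # prints: 1
```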


The firewall's sub-interface was also down.

ciscoasa/pri/stby# show interface ip brief
Interface                  IP-Address      OK? Method Status                Protocol
GigabitEthernet0/0         unassigned      YES unset  up                    up 
GigabitEthernet0/0.6       10.4.0.5        YES CONFIG down                  down


Below is the troubleshooting done on the Secondary ASA, which was acting as the Active firewall at the time.


ciscoasa/sec/act# show version

Cisco Adaptive Security Appliance Software Version 9.1(3)
Device Manager Version 7.1(3)


ciscoasa/sec/act# write standby
Building configuration...

Config replication in progress.... Please try later
[FAILED]

ssh: Write standby failure Case 1

ciscoasa/sec/act# show failover state

               State          Last Failure Reason      Date/Time
This host  -   Secondary
               Active         None
Other host -   Primary
               Sync Config    Comm Failure             02:41:03 UTC May 15 2017

====Configuration State===
        Config Syncing
        Sync Done - STANDBY
====Communication State===


I reloaded the Standby unit first, but it was still unable to sync with the Active unit.


ciscoasa/pri/stby# reload
System config has been modified. Save? [Y]es/[N]o: 
Cryptochecksum: 3cf4b55b c90b56ed 2ef4ec53 2a982127

46859 bytes copied in 0.660 secs
Proceed with reload? [confirm] 


***
*** --- START GRACEFUL SHUTDOWN ---
Shutting down isakmp
Shutting down webvpn
Shutting down sw-module
Shutting down License Controller
Shutting down File system


***
*** --- SHUTDOWN NOW ---
Process shutdown finished
Rebooting.....
Cisco BIOS Version:9B2C109A
Build Date:05/15/2013 16:34:44

<SNIP>

ciscoasa/pri/stby# show failover
Failover On
Failover unit Primary
Failover LAN Interface: folink GigabitEthernet0/7 (up)
Unit Poll frequency 1 seconds, holdtime 15 seconds
Interface Poll frequency 5 seconds, holdtime 25 seconds
Interface Policy 1
Monitored Interfaces 6 of 216 maximum
Version: Ours 9.1(3), Mate 9.1(3)
Last Failover at: 03:04:11 UTC May 23 2017
    This host: Primary - Sync Config
        Active time: 0 (sec)
        slot 0: ASA5525 hw/sw rev (1.0/9.1(3)) status (Up Sys)
    
<OUTPUT TRUNCATED>

    Other host: Secondary - Active
        Active time: 36746504 (sec)
        slot 0: ASA5525 hw/sw rev (1.0/9.1(3)) status (Up Sys)

<OUTPUT TRUNCATED>

    Unable to sync configuration from Active
.

    Detected an Active mate


I then reloaded the Active unit remotely, but it got stuck at the GRACEFUL SHUTDOWN output. I tried a forced reload, but that got stuck at the SHUTDOWN NOW output.


ciscoasa/sec/act# reload
Proceed with reload? [confirm]

***
*** --- START GRACEFUL SHUTDOWN ---          

ciscoasa/sec/act# show reload
No reload is scheduled.

ciscoasa/sec/act# reload noconfirm
ciscoasa/sec/act# show reload
No reload is scheduled.

ciscoasa/sec/act# reload ?

  at             Reload at a specific time/date
  cancel         Cancel a scheduled reload
  in             Reload after a time interval
  max-hold-time  Maximum hold time for orderly reload
  noconfirm      Reload without asking for confirmation
  quick          Quick reload without properly shutting down each subsystem
  reason         Reason for reload
  save-config    Save configuration before reload
  <cr>

ciscoasa/sec/act# reload quick ?

  at             Reload at a specific time/date
  in             Reload after a time interval
  max-hold-time  Maximum hold time for orderly reload
  noconfirm      Reload without asking for confirmation
  reason         Reason for reload
  save-config    Save configuration before reload
  <cr>

ciscoasa/sec/act# reload quick noconfirm

***
*** --- SHUTDOWN NOW ---

My remote out-of-band (OOB) console session got stuck on this output, so I did a 'hard' reload of the Active firewall. After the 'hard' reload of the Secondary (Active) firewall, network traffic failed over to the Primary firewall, which also took over the Active role. I was then able to issue a write standby on the Active firewall, and the configuration was synced to the Standby firewall. I ran some debugs to capture the synchronization and replication messages.

ciscoasa/pri/stby# debug fover cable
ciscoasa/pri/stby# debug fover sync
fover event trace on
ciscoasa/pri/stby# fover_health_monitoring_thread: fover_lan_check() Failover LAN Check

    Unable to sync configuration from Active
.

    No Active mate detected

    Switching to Active
Failover LAN Failed


ciscoasa/pri/act#
Failover LAN became OK
Switchover enabled

Beginning configuration replication: Sending to mate.
fover_rep: frep_write_one_cmd: Cmd: :
fover_rep: frep_write_one_cmd: Cmd:  Written by enable_1 at
fover_rep: frep_write_one_cmd: Cmd: 03:28:34.147 UTC Tue May 23 2017
fover_rep: frep_write_one_cmd: Cmd: !
fover_rep: frep_write_one_cmd: Cmd: ASA Version 9.1(3)
fover_rep: frep_write_one_cmd: Cmd: ASA Version 9.1(3)
fover_rep: frep_write_one_cmd: Cmd: !


<OUTPUT TRUNCATED>
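
If you save the debug capture to a file, the replicated lines can be pulled back out of the "fover_rep: frep_write_one_cmd: Cmd:" messages. This is only a sketch based on the line format seen above; the function name is hypothetical, and stripping whitespace discards any indentation the original config lines had.

```python
PREFIX = "fover_rep: frep_write_one_cmd: Cmd:"

def replicated_commands(debug_text):
    """Collect the text after 'Cmd:' from each replication debug line."""
    cmds = []
    for line in debug_text.splitlines():
        line = line.strip()
        if line.startswith(PREFIX):
            # Note: strip() also removes the config line's own leading indentation.
            cmd = line[len(PREFIX):].strip()
            if cmd:
                cmds.append(cmd)
    return cmds

# Excerpt copied from the debug output above.
SAMPLE_DEBUG = """\
Beginning configuration replication: Sending to mate.
fover_rep: frep_write_one_cmd: Cmd: :
fover_rep: frep_write_one_cmd: Cmd:  Written by enable_1 at
fover_rep: frep_write_one_cmd: Cmd: 03:28:34.147 UTC Tue May 23 2017
fover_rep: frep_write_one_cmd: Cmd: !
fover_rep: frep_write_one_cmd: Cmd: ASA Version 9.1(3)
"""

for cmd in replicated_commands(SAMPLE_DEBUG):
    print(cmd)
```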


ciscoasa/pri/act# show failover
Failover On
Failover unit Primary
Failover LAN Interface: folink GigabitEthernet0/7 (up)
Unit Poll frequency 1 seconds, holdtime 15 seconds
Interface Poll frequency 5 seconds, holdtime 25 seconds
Interface Policy 1
Monitored Interfaces 6 of 216 maximum
Version: Ours 9.1(3), Mate 9.1(3)
Last Failover at: 03:23:56 UTC May 23 2017
    This host: Primary - Active
        Active time: 575 (sec)
       
<OUTPUT TRUNCATED>

    Other host: Secondary - Standby Ready
        Active time: 0 (sec)
        slot 0: ASA5525 hw/sw rev (1.0/9.1(3)) status (Up Sys)


ciscoasa/pri/act# show failover state

               State          Last Failure Reason      Date/Time
This host  -   Primary
               Active         None
Other host -   Secondary
               Standby Ready  Comm Failure             03:24:11 UTC May 23 2017

====Configuration State===
    Sync Done
====Communication State===
    Mac set
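
To check a saved show failover state capture without reading it by eye, something like the following Python sketch could be used. It assumes the two-line-per-host layout shown above and treats "one unit Active, the other Standby Ready" as healthy, which is simply what this pair showed after the fix; the function names are hypothetical.

```python
import re

HOST = re.compile(r"^(This host|Other host)\s+-\s+(\S+)")

def parse_failover_state(text):
    """Return {'This host': (role, state), 'Other host': (role, state)}."""
    result = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        m = HOST.match(stripped)
        if m:
            current = (m.group(1), m.group(2))
            continue
        if current and stripped:
            # The state is the first column of the line that follows the host line.
            state = re.split(r"\s{2,}", stripped)[0]
            result[current[0]] = (current[1], state)
            current = None
    return result

def pair_is_healthy(states):
    """True when one unit is Active and the other is Standby Ready."""
    return {state for _, state in states.values()} == {"Active", "Standby Ready"}

# Copied from the output above (after the fix).
SAMPLE_STATE = """\
               State          Last Failure Reason      Date/Time
This host  -   Primary
               Active         None
Other host -   Secondary
               Standby Ready  Comm Failure             03:24:11 UTC May 23 2017
"""

states = parse_failover_state(SAMPLE_STATE)
print(states["Other host"])     # prints: ('Secondary', 'Standby Ready')
print(pair_is_healthy(states))  # prints: True
```

Run against the earlier capture, where the Primary was stuck in Sync Config, pair_is_healthy returns False.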


ciscoasa/pri/act# write standby
Building configuration...
[OK]
ciscoasa/pri/act# Beginning configuration replication: Sending to mate.
End Configuration Replication to mate


ciscoasa/sec/stby# show version

Cisco Adaptive Security Appliance Software Version 9.1(3)
Device Manager Version 7.1(3)

Compiled on Mon 16-Sep-13 16:07 PDT by builders
System image file is "disk0:/asa913-smp-k8.bin"
Config file at boot was "startup-config"

ciscoasa up 3 mins 18 secs
failover cluster up 27 mins 22 secs

<OUTPUT TRUNCATED>


ciscoasa/sec/stby# show failover
Failover On
Failover unit Secondary
Failover LAN Interface: folink GigabitEthernet0/7 (up)
Unit Poll frequency 1 seconds, holdtime 15 seconds
Interface Poll frequency 5 seconds, holdtime 25 seconds
Interface Policy 1
Monitored Interfaces 6 of 216 maximum
Version: Ours 9.1(3), Mate 9.1(3)
Last Failover at: 03:28:18 UTC May 23 2017
    This host: Secondary - Standby Ready
        Active time: 0 (sec)
        slot 0: ASA5525 hw/sw rev (1.0/9.1(3)) status (Up Sys)
 
<OUTPUT TRUNCATED>

    Other host: Primary - Active
        Active time: 462 (sec)
        slot 0: ASA5525 hw/sw rev (1.0/9.1(3)) status (Up Sys)


The sub-interface on the Standby firewall also went UP.

ciscoasa/sec/stby# show interface ip brief
Interface                  IP-Address      OK? Method Status                Protocol
GigabitEthernet0/0         unassigned      YES unset  up                    up 
GigabitEthernet0/0.6       10.4.0.6        YES CONFIG up                    up