[olug] Fails Over, but does Not Fail Back
Joel B
joel at kansaslinuxfest.org
Tue Apr 27 08:05:42 CDT 2021
Hi Rob,
I've seen this both ways and it seems to be dependent upon the equipment.
Examples:
The firewall units we have at my work run in a cluster (active-passive).
They do not "fail back". The vendor explains that one of the data points
used in deciding which unit is "active" is device uptime (longer
uptime weights a device toward being the "active" unit), so a primary
that has just rebooted tends to lose out to the standby. I just
reboot the now-active unit to restore the original order. (Often this
happens during a maintenance window, so it's a quick check and no
problem rebooting.)
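
As a rough illustration of why an uptime-weighted election does not fail
back on its own, here is a minimal Python sketch. It is not the vendor's
actual algorithm: the Unit class, the elect_active function, the names
fw-a/fw-b, and the "longest uptime wins" rule are all assumptions made up
for the example, and a real cluster weighs more data points than this.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    uptime_s: int          # seconds since last boot (hypothetical field)
    healthy: bool = True


def elect_active(units):
    """Hypothetical rule: the healthy unit with the longest uptime wins."""
    candidates = [u for u in units if u.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda u: u.uptime_s)


primary = Unit("fw-a", uptime_s=90 * 86400)
standby = Unit("fw-b", uptime_s=80 * 86400)
print(elect_active([primary, standby]).name)   # fw-a

# The primary fails and reboots, so its uptime starts over.
primary.uptime_s = 600
standby.uptime_s += 600
print(elect_active([primary, standby]).name)   # fw-b (no fail back)

# Rebooting the now-active standby resets its uptime as well,
# which hands the active role back to the original primary.
standby.uptime_s = 0
primary.uptime_s += 300
print(elect_active([primary, standby]).name)   # fw-a
```

Under this assumed rule the recovered primary always has the shorter
uptime, so nothing changes until the standby's uptime is reset too.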
In our networking switches (a different vendor than the firewall units) we
use MSTP (Multiple Spanning Tree Protocol). The links & switches have
priority settings that do not depend on device uptime, so if a
"spanning-tree event" occurs (link/switch/etc. failure), when things
recover they return to the desired setup based on those priorities. No
extra intervention required.
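
For contrast, a priority-based election can be sketched the same way. This
is a simplified model, not MSTP itself: the Switch class, the elect_root
function, and the names core1/core2 are hypothetical, and the switch name
stands in for the MAC address that spanning tree uses as a tie-breaker.
The one detail taken from the protocol is that the lowest priority value
wins.

```python
from dataclasses import dataclass


@dataclass
class Switch:
    name: str
    priority: int      # lower value wins, as with bridge priority
    up: bool = True


def elect_root(switches):
    """Pick the live switch with the lowest (priority, name) pair."""
    live = [s for s in switches if s.up]
    if not live:
        return None
    return min(live, key=lambda s: (s.priority, s.name))


core1 = Switch("core1", priority=4096)
core2 = Switch("core2", priority=8192)
print(elect_root([core1, core2]).name)   # core1

core1.up = False                         # failure: core2 takes over
print(elect_root([core1, core2]).name)   # core2

core1.up = True                          # recovery: core1 wins again
print(elect_root([core1, core2]).name)   # core1
```

Because the outcome depends only on configured priorities and on which
devices are up, the same inputs always produce the same winner, and
recovery restores the original layout without a manual step.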
So I see it happen both ways.
-Joel
On 4/27/2021 3:45 AM, Rob Townley wrote:
> tl;dr: Systems that reliably fail over to a redundant system, but absolutely
> refuse to revert back to the primary system.
>
> Looking for general guidelines on systems (primarily networking) to
> troubleshoot the fail back to primary pathway.
>
> The failover happens reliably. The problem is when the primary comes
> back up, actually reverting back, aka “Failing Back” to the primary path.
>
> Have experienced this failure to fail back too many times across a variety
> of equipment and systems. Looking for general guidelines. What do noobs
> usually miss?
>
> Also, is it a common problem or just me?