[FEATURE] Share manager HA - Experimental

## Is your feature request related to a problem? Please describe (👍 if you like this request)

Longhorn's share manager is the backend that serves RWX volumes through NFS-Ganesha. Because it runs as a single instance, it is a single point of failure (SPOF), even though recovery has been improved by the recovery backend mechanism from https://github.com/longhorn/longhorn/issues/2293.

The goal is to make the share manager highly available, rather than relying only on a shorter recovery time, which is unpredictable because it depends on environmental factors.

## Describe the solution you'd like

## Describe alternatives you've considered


## Additional context

Before 1.4.4 and 1.5.2, some kernel issues could cause the volume node to get stuck during a node reboot or upgrade if the share manager pod was disconnected, because we use a hard-mode NFS mount. To resolve this, soft mode will be reintroduced with a longer timeout in 1.4.4, 1.5.2, and 1.6. The detailed context is at https://github.com/longhorn/longhorn/issues/6655#issuecomment-1726948794. However, soft mode carries a potential risk of data loss if the timeout is not well chosen; the timeout should at least account for the pod eviction timeout.
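
To illustrate the timeout constraint, here is a minimal sketch of a sanity check. It assumes the effective soft-mount timeout (which on Linux results from NFS mount options such as `timeo` and `retrans`) is already known in seconds. The function name, the `recovery_margin_s` parameter, and the example values are hypothetical and are not part of Longhorn.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftMountCheck:
    """Result of comparing a soft-mount timeout with the time it must cover."""

    soft_timeout_s: float  # effective time before the NFS client gives up
    required_s: float      # eviction timeout plus any extra recovery margin

    @property
    def is_safe(self) -> bool:
        # The client should keep retrying at least until the share manager
        # pod can be evicted and recreated elsewhere.
        return self.soft_timeout_s >= self.required_s


def check_soft_mount_timeout(
    soft_timeout_s: float,
    pod_eviction_timeout_s: float,
    recovery_margin_s: float = 0.0,
) -> SoftMountCheck:
    """Hypothetical helper: all inputs are assumptions supplied by the caller."""
    if min(soft_timeout_s, pod_eviction_timeout_s, recovery_margin_s) < 0:
        raise ValueError("timeouts must be non-negative")
    required_s = pod_eviction_timeout_s + recovery_margin_s
    return SoftMountCheck(soft_timeout_s=soft_timeout_s, required_s=required_s)


if __name__ == "__main__":
    # Example values only: a 300 s eviction timeout plus a 60 s restart margin.
    result = check_soft_mount_timeout(600, 300, recovery_margin_s=60)
    print(result.is_safe, result.required_s)
```

A timeout shorter than `required_s` means I/O could fail before a replacement share manager pod is available, which is the data loss risk described above.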

Eventually, hard mode will be readopted together with this feature. That does not mean the stuck-node situation can no longer occur, but it should become very rare: it would happen only if all share manager HA nodes are down or all share manager pods are unavailable, so that HA is lost.

cc @longhorn/dev-data-plane 


[FEATURE] Share manager HA - Experimental #6205
