We’ve had a few NetApp issues since we went live. One of them was that we were shipped the wrong card, so we had to reconfigure the nodes with the proper 10GE fiber cards right before we went live. We also weren’t aware of the cluster requirement to set the partner IP. Last night we added the partner IPs to each of the interfaces we are using, rebooted both nodes, and ran a failover test. The failover worked great, and everything is good.
Every Sunday we have also been seeing performance degrade on one of the filers. It starts with some packet loss over the LAN and cascades until the filer either has to be rebooted or stops responding to ping. This affects the 1G interfaces and the 10G interfaces alike. This filer is serving NFS for VMware, as well as CIFS for standard file serving to a farm of web servers.
I’ve had a case open with NetApp, but the response time of the engineers has been lackluster, which is surprising since we have one outage per week on this node. Just yesterday I noticed that we were seeing errors on one of the filer’s 10G interfaces (only one of them), but after the reboot there were none. The switch wasn’t seeing any errors, only the filer.
The interface statistics are below (the Network and Address values have been changed):
Name  Mtu   Network     Address    Ipkts  Ierrs  Opkts  Oerrs  Collis  Queue
e0a*  1500  none        none       0      0      0      0      0       0
e0b   1500  10.10.5/24  *SNIP*     1m     0      1m     0      0       0
e0c*  1500  none        none       0      0      0      0      0       0
e0d   1500  10.10.3/24  *SNIP*     25m    0      16m    0      0       0
e2a   1500  10.10.5/24  *SNIP*     38m    2m     3m     0      0       0
e2b   1500  10.10.2/24  *SNIP*     65k    0      18m    0      0       0
lo    8160  127         localhost  30k    0      30k    0      0       0
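To put the e2a numbers in perspective, here is a minimal Python sketch that works out the input error ratio per interface from output like the above. It is not a NetApp tool: the function names and the 0.1% threshold are my own assumptions, and it assumes the counters are abbreviated with k/m suffixes exactly as they appear in the table. Because those counters are rounded, the result is only approximate.

```python
# Hypothetical helper (not part of Data ONTAP): parse interface statistics
# in the column layout shown above and compute Ierrs / Ipkts per interface.

SUFFIXES = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}


def parse_count(text):
    """Convert an abbreviated counter such as '38m' or '65k' to an integer."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return int(float(text[:-1]) * SUFFIXES[text[-1]])
    return int(text)


def input_error_ratios(table):
    """Return {interface: Ierrs / Ipkts} for interfaces that received packets."""
    ratios = {}
    for line in table.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 10:  # ignore lines that do not have all ten columns
            continue
        name = fields[0].rstrip("*")  # '*' marks an interface that is down
        ipkts = parse_count(fields[4])
        ierrs = parse_count(fields[5])
        if ipkts > 0:
            ratios[name] = ierrs / ipkts
    return ratios


SAMPLE = """\
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Collis Queue
e0a* 1500 none none 0 0 0 0 0 0
e0b 1500 10.10.5/24 *SNIP* 1m 0 1m 0 0 0
e0c* 1500 none none 0 0 0 0 0 0
e0d 1500 10.10.3/24 *SNIP* 25m 0 16m 0 0 0
e2a 1500 10.10.5/24 *SNIP* 38m 2m 3m 0 0 0
e2b 1500 10.10.2/24 *SNIP* 65k 0 18m 0 0 0
lo 8160 127 localhost 30k 0 30k 0 0 0
"""

THRESHOLD = 0.001  # assumed cutoff of 0.1%; pick whatever suits your network

if __name__ == "__main__":
    for name, ratio in input_error_ratios(SAMPLE).items():
        if ratio > THRESHOLD:
            print(f"{name}: {ratio:.1%} of inbound packets had errors")
```

With the table above, e2a is the only interface flagged, at roughly 5% input errors (2m out of 38m), which is far too high to ignore on a storage link.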