[Avila] fatal ethernet error on gw2348-4
David Acker
dacker at roinet.com
Wed Oct 11 09:23:59 EDT 2006
I found the issue. See the attached patch.
The failure is due to
ixp400_xscale_sw/src/ethAcc/IxEthAccDataPlane.c::ixEthAccPortRxFreeReplenish(...)
thinking that the length is too small. See the following test:
if (IX_OSAL_MBUF_MLEN(buffer) < IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN)
The length reported by IX_OSAL_MBUF_MLEN(buffer) is 0 and
IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN = 64.
ixEthRxFrameProcess calls ixEthAccPortRxFreeReplenish and before the
call it tries to reset the length with the following line of code:
IX_OSAL_MBUF_MLEN(mbufPtr) = IX_OSAL_MBUF_PKT_LEN(mbufPtr) =
IX_OSAL_MBUF_ALLOCATED_BUFF_LEN(mbufPtr);
Before this assignment, IX_OSAL_MBUF_MLEN(mbufPtr) and
IX_OSAL_MBUF_PKT_LEN(mbufPtr) report various sizes, sometimes below 64
(I have seen as low as 60). IX_OSAL_MBUF_ALLOCATED_BUFF_LEN(mbufPtr),
however, reports 0, so IX_OSAL_MBUF_MLEN(mbufPtr) becomes 0 and the
length test fails.
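To make the failure path concrete, here is a minimal sketch of the reset followed by the length test. The struct, the sketch_ function names, and the SKETCH_ constant are hypothetical stand-ins, not the real OSAL macros or the real ixEthAccPortRxFreeReplenish; only the value 64 and the order of operations come from the description above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the OSAL mbuf; only the three lengths matter. */
struct sketch_mbuf {
    size_t m_len;         /* plays the role of IX_OSAL_MBUF_MLEN */
    size_t pkt_len;       /* plays the role of IX_OSAL_MBUF_PKT_LEN */
    size_t allocated_len; /* plays the role of IX_OSAL_MBUF_ALLOCATED_BUFF_LEN */
};

/* Minimum length reported above for IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN. */
#define SKETCH_RXFREE_BUFFER_LENGTH_MIN 64

/* Mirrors the reset done before replenishing: both lengths are
 * overwritten with the allocated buffer length. */
static void sketch_reset_lengths(struct sketch_mbuf *m)
{
    m->m_len = m->pkt_len = m->allocated_len;
}

/* Mirrors the length test: returns 0 if the buffer is accepted,
 * -1 if it is rejected as too small. */
static int sketch_replenish(const struct sketch_mbuf *m)
{
    if (m->m_len < SKETCH_RXFREE_BUFFER_LENGTH_MIN)
        return -1;
    return 0;
}
```

With allocated_len at 0, the reset always drives m_len to 0 no matter what length the frame had, so the test rejects every such buffer.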
The packets that trigger this have one of the following flag values:
0x8090
  IX_ETHACC_NE_LINKMASK=0x01 - IEEE802.3 - Ethernet (Rx) / IEEE802.3 - Ethernet (Tx)
  IX_ETHACC_NE_FILTERMASK is set
  IX_ETHACC_NE_NEWSRCMASK is set
or
0x8080
  IX_ETHACC_NE_LINKMASK=0x00 - IEEE802.3 - 8802 (Rx) / IEEE802.3 - 8802 (Tx)
  IX_ETHACC_NE_FILTERMASK is set
  IX_ETHACC_NE_NEWSRCMASK is set
It is the IX_ETHACC_NE_FILTERMASK that makes the driver try to replenish
the packet. According to the header IX_ETHACC_NE_FILTERMASK means:
* @brief This bit indicates whether a frame has been filtered by the
* Rx service.
*
* This mask applies to @a IX_ETHACC_NE_FLAGS.
* Certain frames, which should normally be fully filtered by the NPE due to
* the destination MAC address being on the same segment as the Rx port are
* still forwarded to the XScale (although the payload is invalid) in order
* to learn the MAC address of the transmitting station, if this is unknown.
* Normally EthAcc will filter and recycle these frames internally and no
* frames with the FILTER bit set will be received by the client.
I suspect this is occurring because we are in promiscuous mode. It
doesn't kill your ethernet immediately because each such message leaks
one buffer, and there are RX_MBUF_POOL_SIZE (80) of them available.
Once you run out, you cannot receive but you can still send.
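The slow death can be illustrated with a small simulation. This is only a sketch under the assumption that every failed replenish loses exactly one buffer; the pool struct and function names are hypothetical, and only the pool size of 80 comes from the driver.

```c
/* Pool size reported above for RX_MBUF_POOL_SIZE. */
#define SKETCH_RX_POOL_SIZE 80

/* Hypothetical pool model: just a count of free receive buffers. */
struct sketch_pool {
    int free_buffers;
};

static void sketch_pool_init(struct sketch_pool *p)
{
    p->free_buffers = SKETCH_RX_POOL_SIZE;
}

/* Receive one frame. Returns 1 if a buffer was available for it,
 * 0 if the pool is empty and the frame cannot be received.
 * When replenish_ok is 0 the buffer is never returned (the leak). */
static int sketch_receive(struct sketch_pool *p, int replenish_ok)
{
    if (p->free_buffers == 0)
        return 0;
    p->free_buffers--;          /* buffer handed out for this frame */
    if (replenish_ok)
        p->free_buffers++;      /* buffer given back to the pool */
    return 1;
}

/* Count how many frames with a failing replenish are received
 * before the pool is drained and receive stalls. */
static int sketch_frames_until_stall(void)
{
    struct sketch_pool p;
    int received = 0;

    sketch_pool_init(&p);
    while (sketch_receive(&p, 0))
        received++;
    return received;
}
```

In this model, frames whose replenish succeeds leave the pool untouched, so the time to failure depends only on how often a filtered frame arrives, which matches the port working for a long while before it stops receiving.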
The bug occurs because the ethernet driver creates the receive buffer
pool with a buffer size of 0. Sadly, the IX_OSAL_MBUF_POOL_INIT function
does not check the size passed in, even though the replenish code has a
minimum size restriction. I changed the size passed in to
IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN (64), and now this situation
seems to be handled properly.
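For illustration, this is the kind of check that would have caught the problem at pool creation time. It is a hypothetical wrapper, not something IX_OSAL_MBUF_POOL_INIT does and not part of the attached patch; the struct and function names are made up for the sketch.

```c
#include <stddef.h>

/* Minimum length reported above for IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN. */
#define SKETCH_RXFREE_BUFFER_LENGTH_MIN 64

/* Hypothetical description of a receive pool: buffer count and size. */
struct sketch_pool_cfg {
    size_t count;
    size_t buf_len;
};

/* Validate the requested buffer size before creating the pool.
 * Returns 0 and fills in cfg on success, or -1 if buf_len is smaller
 * than what the replenish path will later accept. */
static int sketch_pool_cfg_init(struct sketch_pool_cfg *cfg,
                                size_t count, size_t buf_len)
{
    if (buf_len < SKETCH_RXFREE_BUFFER_LENGTH_MIN)
        return -1;
    cfg->count = count;
    cfg->buf_len = buf_len;
    return 0;
}
```

Rejecting the size up front turns a slow buffer leak into an immediate, visible failure at probe time.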
-Ack
David Acker wrote:
> The port is not part of a bridge. It is in promiscuous mode and is
> receiving and sending a relatively large amount of traffic. I have a
> userspace program that has a raw socket bound to each port. I don't
> have any netfilter rules in place.
>
> I have done some more debugging and found some information. The failure
> is due to
> ixp400_xscale_sw/src/ethAcc/IxEthAccDataPlane.c::ixEthAccPortRxFreeReplenish(...)
> thinking that the length is too small. See the following test:
> if (IX_OSAL_MBUF_MLEN(buffer) < IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN)
>
> The length reported by IX_OSAL_MBUF_MLEN(buffer) is 0 and
> IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN = 64.
>
> ixEthRxFrameProcess calls ixEthAccPortRxFreeReplenish and before the
> call it tries to reset the length with the following line of code:
> IX_OSAL_MBUF_MLEN(mbufPtr) = IX_OSAL_MBUF_PKT_LEN(mbufPtr) =
> IX_OSAL_MBUF_ALLOCATED_BUFF_LEN(mbufPtr);
>
> Before this call to adjust the lengths I have seen the lengths reported
> by IX_OSAL_MBUF_MLEN(mbufPtr) and IX_OSAL_MBUF_PKT_LEN(mbufPtr) be
> various sizes although sometimes they are below 64 (I have seen as low
> as 60). IX_OSAL_MBUF_ALLOCATED_BUFF_LEN(mbufPtr) is reporting 0 which
> leads to IX_OSAL_MBUF_MLEN(mbufPtr) becoming 0 and the length test failing.
>
> I am still looking into when the allocated buffer length got corrupted.
> It could be a locking issue and/or a race condition that only happens
> at high speeds. With any luck I will know more by the end of the day.
> -Ack
>
> Dave G wrote:
>> Ack,
>>
>>
>> Is the affected port part of a bridge group? Do you have any netfilter
>> rules in place either using bridge nf or not?
>>
>>
>> -Dave
>>
>>
>>> Hello folks,
>>> We are running the .06 development kit on our gw2348-4 board. If we
>>> run it long enough with even small amounts of ethernet traffic we
>>> eventually get the following error:
>>>
>>> [fatal] ixEthRxFrameProcess: Failed to replenish with filtered frame
>>> on port 0
>>>
>>> After this error the port usually does not receive at all but can
>>> send. For example, when I ping an ethernet client that has ethereal
>>> on, I can see the ARP requests, and I can see the client send the ARP
>>> response, but the board's arp tables do not show ever getting the
>>> response. The port stays in this state through unplug/replug of the
>>> cable. Sometimes it stays in this state through a reboot command.
>>> It always clears up on a full power cycle. The other ethernet port
>>> will work when one port is in this state.
>>>
>>> Does anyone know what causes this error and how to fix it?
>>>
>>>
>
>
Attached patch (replenishBuffer.diff):
--- snapgear/modules/ixp425/net-2.0/ixp400_eth.c.orig 2006-10-10 16:54:17.000000000 -0400
+++ snapgear/modules/ixp425/net-2.0/ixp400_eth.c 2006-10-10 14:07:54.000000000 -0400
@@ -3198,7 +3198,7 @@ static int __devinit dev_eth_probe(struc
TRACE;
/* initialize RX pool */
- priv->rx_pool = IX_OSAL_MBUF_POOL_INIT(RX_MBUF_POOL_SIZE, 0,
+ priv->rx_pool = IX_OSAL_MBUF_POOL_INIT(RX_MBUF_POOL_SIZE, IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN,
"IXP400 Ethernet driver Rx Pool");
if(priv->rx_pool == NULL)
{