[Avila] fatal ethernet error on gw2348-4

devel devel at oberonwireless.com
Fri Oct 13 11:03:01 EDT 2006


    I have tried this patch, and I haven't seen the error message in about 
2 days. However, some clients are still not getting responses back, because 
at times the ethernet port still seems to stop receiving.

    Before David Acker mentioned any of this and found a possible 
solution, I didn't even think that my problem was caused by the ethernet 
port/driver. My setup is an ethernet bridge between the wired ethernet port 
and a wifi card (madwifi driver) in Access Point mode, on an Avila 
GW2348-4. If I recall correctly, this problem started when we went to 
the BSP's 2.6 Linux kernel. I have been after madwifi for months, off and 
on, trying to resolve it, thinking the madwifi driver was the culprit.

    Now with David's findings, it certainly makes sense to me that my 
problem is similar to his, in the sense that a wifi client can associate, 
ping the AP, and ping associated clients, but cannot reach anything on the 
other side of the AP. I have ethereal running on a server I am trying to 
ping from a wifi client. The server sees the ping request and sends its 
reply. Looks good there. But the client never sees the reply. I have put 
tcpdump on the AP and it never sees the server's reply either. So my guess 
is the ethernet port periodically stops receiving packets.

    So at the moment I'm not sure what to try. Maybe play with the source 
code and look a little closer at what David was working with. If anyone has 
any thoughts, please let me know. Thanks!

Travis

----- Original Message ----- 
From: "David Acker" <dacker at roinet.com>
To: "Avila" <avila at lists.unixstudios.net>
Sent: Wednesday, October 11, 2006 9:23 AM
Subject: Re: [Avila] fatal ethernet error on gw2348-4

> I found the issue.  See the attached patch.
>
> The failure is due to
> ixp400_xscale_sw/src/ethAcc/IxEthAccDataPlane.c::ixEthAccPortRxFreeReplenish(...)
> thinking that the length is too small.  See the following test:
> if (IX_OSAL_MBUF_MLEN(buffer) < IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN)
>
> The length reported by IX_OSAL_MBUF_MLEN(buffer) is 0 and
> IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN = 64.
>
> ixEthRxFrameProcess calls ixEthAccPortRxFreeReplenish and before the
> call it tries to reset the length with the following line of code:
> IX_OSAL_MBUF_MLEN(mbufPtr) = IX_OSAL_MBUF_PKT_LEN(mbufPtr) =
> IX_OSAL_MBUF_ALLOCATED_BUFF_LEN(mbufPtr);
>
> Before this call to adjust the lengths I have seen the lengths reported
> by IX_OSAL_MBUF_MLEN(mbufPtr) and IX_OSAL_MBUF_PKT_LEN(mbufPtr) be
> various sizes although sometimes they are below 64 (I have seen as low
> as 60).  IX_OSAL_MBUF_ALLOCATED_BUFF_LEN(mbufPtr) is reporting 0 which
> leads to IX_OSAL_MBUF_MLEN(mbufPtr) becoming 0 and the length test 
> failing.
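
The sequence described above can be sketched in plain C. This is only a model under stated assumptions: `struct sketch_mbuf` and the `sketch_*` functions are hypothetical stand-ins for the OSAL mbuf and its length macros, not the Intel access library code. The minimum of 64 is the value quoted in the message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimum Rx-free buffer length quoted in the message above. */
#define SKETCH_RXFREE_BUFFER_LENGTH_MIN 64

/* Hypothetical stand-in for the three mbuf length fields discussed above. */
struct sketch_mbuf {
    size_t m_len;          /* what IX_OSAL_MBUF_MLEN() would report */
    size_t pkt_len;        /* what IX_OSAL_MBUF_PKT_LEN() would report */
    size_t allocated_len;  /* what IX_OSAL_MBUF_ALLOCATED_BUFF_LEN() would report */
};

/* Model of the reset done before the replenish call: both lengths
 * are overwritten with the allocated buffer length. */
static void sketch_reset_lengths(struct sketch_mbuf *m)
{
    m->m_len = m->pkt_len = m->allocated_len;
}

/* Model of the length test: a buffer shorter than the minimum is rejected. */
static bool sketch_replenish_accepts(const struct sketch_mbuf *m)
{
    return m->m_len >= SKETCH_RXFREE_BUFFER_LENGTH_MIN;
}

/* Reset, then test, in the order described above. */
static bool sketch_recycle(struct sketch_mbuf *m)
{
    sketch_reset_lengths(m);
    return sketch_replenish_accepts(m);
}
```

With `allocated_len` equal to 0, `sketch_recycle` returns false no matter what the frame lengths were beforehand, which matches the failure described. With `allocated_len` equal to 64 it returns true.
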
>
> The packets that trigger this have the following flags set:
> 0x8090
> IX_ETHACC_NE_LINKMASK=0x01 - IEEE802.3 - Ethernet (Rx) / IEEE802.3 -
> Ethernet (Tx)
> IX_ETHACC_NE_FILTERMASK is set
> IX_ETHACC_NE_NEWSRCMASK is set
> or
> 0x8080
> IX_ETHACC_NE_LINKMASK=0x00 - IEEE802.3 - 8802 (Rx) / IEEE802.3 - 8802 (Tx)
> IX_ETHACC_NE_FILTERMASK is set
> IX_ETHACC_NE_NEWSRCMASK is set
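
As a small aid for reading the two flag words listed above, the sketch below computes which bits they share and which bits differ. It deliberately does not name individual mask values, since the message does not give them; the `sketch_*` helpers are hypothetical and only do bit arithmetic on the two reported values.

```c
#include <stdint.h>

/* The two flag words reported in the message above. */
#define SKETCH_FLAGS_A 0x8090u
#define SKETCH_FLAGS_B 0x8080u

/* Bits set in both flag words. */
static uint16_t sketch_common_bits(uint16_t a, uint16_t b)
{
    return (uint16_t)(a & b);
}

/* Bits set in exactly one of the two flag words. */
static uint16_t sketch_differing_bits(uint16_t a, uint16_t b)
{
    return (uint16_t)(a ^ b);
}
```

Calling `sketch_common_bits(SKETCH_FLAGS_A, SKETCH_FLAGS_B)` gives 0x8080 and `sketch_differing_bits(...)` gives 0x0010. That is consistent with the list above, where the only stated difference between the two cases is the link type value. Which of the shared bits belongs to which mask would have to be checked against the header.
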
> It is the IX_ETHACC_NE_FILTERMASK that makes the driver try to replenish
> the packet.  According to the header IX_ETHACC_NE_FILTERMASK means:
>  * @brief This bit indicates whether a frame has been filtered by the
>  * Rx service.
>  *
>  * This mask applies to @a IX_ETHACC_NE_FLAGS.
>  * Certain frames, which should normally be fully filtered by the NPE due to
>  * the destination MAC address being on the same segment as the Rx port, are
>  * still forwarded to the XScale (although the payload is invalid) in order
>  * to learn the MAC address of the transmitting station, if this is unknown.
>  * Normally EthAcc will filter and recycle these frames internally and no
>  * frames with the FILTER bit set will be received by the client.
>
> I suspect this is occurring because we are in promiscuous mode.  The
> reason it doesn't kill your ethernet immediately is that each occurrence
> of the error leaks one buffer and there are RX_MBUF_POOL_SIZE (80) of them
> available.  Once you run out, you cannot receive but you can still send.
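
The slow failure described above can be modelled with a simple counter. This is a sketch only: `struct sketch_rx_pool` and the `sketch_*` functions are hypothetical and track nothing but the number of free Rx buffers; the pool size of 80 is the figure quoted in the message.

```c
#include <stdbool.h>

/* Pool size quoted in the message above (RX_MBUF_POOL_SIZE). */
#define SKETCH_RX_POOL_SIZE 80

/* Hypothetical model: only the count of free Rx buffers is tracked. */
struct sketch_rx_pool {
    int free_buffers;
};

static void sketch_pool_init(struct sketch_rx_pool *p)
{
    p->free_buffers = SKETCH_RX_POOL_SIZE;
}

/* Receive one frame. Returns false when no buffer is left.
 * replenish_ok models whether the buffer makes it back to the pool;
 * for a filtered frame hitting the bug it is false, so the buffer leaks. */
static bool sketch_receive(struct sketch_rx_pool *p, bool replenish_ok)
{
    if (p->free_buffers == 0)
        return false;            /* Rx is starved */
    p->free_buffers--;           /* buffer handed to the frame */
    if (replenish_ok)
        p->free_buffers++;       /* buffer recycled */
    return true;
}

/* Count how many leaking frames are accepted before Rx stops. */
static int sketch_frames_until_starved(struct sketch_rx_pool *p)
{
    int n = 0;
    while (sketch_receive(p, false))
        n++;
    return n;
}
```

In this model, frames that recycle their buffer leave the pool untouched, while every failed replenish removes one buffer for good. After 80 such failures `sketch_receive` returns false for every further frame, which corresponds to the port no longer receiving. Transmit is not modelled here.
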
>
> The bug occurs because the ethernet driver is creating the receive
> buffer pool with a size of 0.  Sadly the IX_OSAL_MBUF_POOL_INIT function
> does not check the size passed in even though the replenish code has a
> minimum size restriction.  I changed the size passed in to
> IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN or 64 and now this situation
> seems to be handled properly.
> -Ack
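
One way to catch this class of mistake earlier would be a size check at pool creation time. The sketch below is purely illustrative: `struct sketch_pool` and `sketch_pool_init_checked` are hypothetical and do not wrap or replace the real IX_OSAL_MBUF_POOL_INIT; they only show the check the message says is missing, using the minimum of 64 quoted above.

```c
#include <stddef.h>

/* Minimum Rx-free buffer length quoted in the message above. */
#define SKETCH_RXFREE_BUFFER_LENGTH_MIN 64

/* Hypothetical pool descriptor; the real OSAL pool type is not modelled. */
struct sketch_pool {
    unsigned    count;     /* number of buffers in the pool */
    size_t      buf_size;  /* size of each buffer in bytes */
    const char *name;      /* descriptive pool name */
};

/* Hypothetical checked initializer.
 * Returns 0 on success, -1 if the arguments are unusable. */
static int sketch_pool_init_checked(struct sketch_pool *pool, unsigned count,
                                    size_t buf_size, const char *name)
{
    if (pool == NULL || count == 0)
        return -1;                 /* nothing to initialize */
    if (buf_size < SKETCH_RXFREE_BUFFER_LENGTH_MIN)
        return -1;                 /* would later fail the replenish length test */
    pool->count = count;
    pool->buf_size = buf_size;
    pool->name = name;
    return 0;
}
```

With this check, a call that passes a buffer size of 0 is rejected at initialization, while a call that passes 64 succeeds.
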
>
>
> David Acker wrote:
>> The port is not part of a bridge.  It is in promiscuous mode and is
>> receiving and sending a relatively large amount of traffic.  I have a
>> userspace program that has a raw socket bound to each port.  I don't
>> have any netfilter rules in place.
>>
>> I have done some more debugging and found some information.  The failure
>>  is due to
>> ixp400_xscale_sw/src/ethAcc/IxEthAccDataPlane.c::ixEthAccPortRxFreeReplenish(...)
>> thinking that the length is too small.  See the following test:
>> if (IX_OSAL_MBUF_MLEN(buffer) < IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN)
>>
>> The length reported by IX_OSAL_MBUF_MLEN(buffer) is 0 and
>> IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN = 64.
>>
>> ixEthRxFrameProcess calls ixEthAccPortRxFreeReplenish and before the
>> call it tries to reset the length with the following line of code:
>> IX_OSAL_MBUF_MLEN(mbufPtr) = IX_OSAL_MBUF_PKT_LEN(mbufPtr) =
>> IX_OSAL_MBUF_ALLOCATED_BUFF_LEN(mbufPtr);
>>
>> Before this call to adjust the lengths I have seen the lengths reported
>> by IX_OSAL_MBUF_MLEN(mbufPtr) and IX_OSAL_MBUF_PKT_LEN(mbufPtr) be
>> various sizes although sometimes they are below 64 (I have seen as low
>> as 60).  IX_OSAL_MBUF_ALLOCATED_BUFF_LEN(mbufPtr) is reporting 0 which
>> leads to IX_OSAL_MBUF_MLEN(mbufPtr) becoming 0 and the length test 
>> failing.
>>
>> I am still looking into when the allocated buffer length got corrupted.
>>  It could be a locking issue and/or a race condition that only happens
>> at high speeds.  With any luck I will know more by the end of the day.
>> -Ack
>>
>> Dave G wrote:
>>> Ack,
>>>
>>>
>>> Is the affected port part of a bridge group? Do you have any netfilter
>>> rules in place either using bridge nf or not?
>>>
>>>
>>> -Dave
>>>
>>>
>>>> Hello folks,
>>>> We are running the .06 development kit on our gw2348-4 board.  If we
>>>> run it long enough with even small amounts of ethernet traffic we
>>>> eventually get the following error:
>>>>
>>>> [fatal] ixEthRxFrameProcess: Failed to replenish with filtered frame
>>>> on port 0
>>>>
>>>> After this error the port usually does not receive at all but can
>>>> send. For example, when I ping an ethernet client that has ethereal
>>>> on, I can see the ARP requests, and I can see the client send the ARP
>>>> response, but the board's arp tables do not show ever getting the
>>>> response.  The port stays in this state through unplug/replug of the
>>>> cable.  Sometimes it stays in this state through a reboot command.
>>>> It always clears up on a full power cycle.  The other ethernet port
>>>> will work when one port is in this state.
>>>>
>>>> Does anyone know what causes this error and how to fix it?
>>>>
>>>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: avila-unsubscribe at lists.unixstudios.net
>> For additional commands, e-mail: avila-help at lists.unixstudios.net
>>
>
>

--------------------------------------------------------------------------------

> --- snapgear/modules/ixp425/net-2.0/ixp400_eth.c.orig 2006-10-10 16:54:17.000000000 -0400
> +++ snapgear/modules/ixp425/net-2.0/ixp400_eth.c 2006-10-10 14:07:54.000000000 -0400
> @@ -3198,7 +3198,7 @@ static int __devinit dev_eth_probe(struc
>     TRACE;
>
>     /* initialize RX pool */
> -    priv->rx_pool = IX_OSAL_MBUF_POOL_INIT(RX_MBUF_POOL_SIZE, 0,
> +    priv->rx_pool = IX_OSAL_MBUF_POOL_INIT(RX_MBUF_POOL_SIZE, IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN,
>             "IXP400 Ethernet driver Rx Pool");
>     if(priv->rx_pool == NULL)
>     {
>
>

--------------------------------------------------------------------------------
