[Avila] fatal ethernet error on gw2348-4
devel
devel at oberonwireless.com
Wed Oct 18 17:16:42 EDT 2006
Disregard my previous post. It does appear to be working, as I have had no
problems for a few days now. Thanks.
Travis
----- Original Message -----
From: "devel" <devel at oberonwireless.com>
To: "Avila" <avila at lists.unixstudios.net>
Sent: Friday, October 13, 2006 11:03 AM
Subject: Re: [Avila] fatal ethernet error on gw2348-4
> I have tried this patch, and I can say I haven't seen the error message
> in about 2 days. However, I am still having a problem with some clients
> not getting the response back because it still seems at times that the
> ethernet port is still not receiving.
>
> Before David Acker had mentioned any of this and found a possible
> solution I didn't even think that my problem was caused by the ethernet
> port/driver. My setup is an ethernet bridge between the wired
> ethernet port and a wifi card (madwifi driver) in Access Point mode
> (Avila GW2348-4). If I recall correctly, this problem started when we
> went to the BSP's 2.6 Linux kernel. I have been after madwifi for months,
> off and on, trying to resolve it thinking the madwifi driver was the
> culprit.
>
> Now with David's findings, it certainly makes sense to me that my
> problem is similar to his in the sense that a wifi client can associate,
> ping the AP, ping associated clients, but nothing on the other side of the
> AP. I have ethereal running on a server I am trying to ping from a wifi
> client. The server sees the ping request and sends its reply. Looks good
> there. But the client never sees the reply. I have put tcpdump on the AP
> and it never sees the server's reply either. So my guess is the ethernet
> port periodically stops receiving packets.
>
> So at the moment I'm not sure what to try. Maybe play with the source
> code and look a little closer at what David was working with. If anyone
> has any thoughts, please let me know. Thanks!
>
>
> Travis
>
>
>
> ----- Original Message -----
> From: "David Acker" <dacker at roinet.com>
> To: "Avila" <avila at lists.unixstudios.net>
> Sent: Wednesday, October 11, 2006 9:23 AM
> Subject: Re: [Avila] fatal ethernet error on gw2348-4
>
>
>>I found the issue. See the attached patch.
>>
>> The failure is due to
>> ixp400_xscale_sw/src/ethAcc/IxEthAccDataPlane.c::ixEthAccPortRxFreeReplenish(...)
>> thinking that the length is too small. See the following test:
>> if (IX_OSAL_MBUF_MLEN(buffer) < IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN)
>>
>> The length reported by IX_OSAL_MBUF_MLEN(buffer) is 0 and
>> IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN = 64.
>>
>> ixEthRxFrameProcess calls ixEthAccPortRxFreeReplenish and before the
>> call it tries to reset the length with the following line of code:
>> IX_OSAL_MBUF_MLEN(mbufPtr) = IX_OSAL_MBUF_PKT_LEN(mbufPtr) =
>> IX_OSAL_MBUF_ALLOCATED_BUFF_LEN(mbufPtr);
>>
>> Before this call to adjust the lengths I have seen the lengths reported
>> by IX_OSAL_MBUF_MLEN(mbufPtr) and IX_OSAL_MBUF_PKT_LEN(mbufPtr) be
>> various sizes although sometimes they are below 64 (I have seen as low
>> as 60). IX_OSAL_MBUF_ALLOCATED_BUFF_LEN(mbufPtr) is reporting 0 which
>> leads to IX_OSAL_MBUF_MLEN(mbufPtr) becoming 0 and the length test
>> failing.
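
The failure path described above can be sketched in isolation. This is only an illustration under stated assumptions: `struct sketch_mbuf` and the `sketch_*` functions are hypothetical stand-ins for the real mbuf and the `IX_OSAL_MBUF_*` macros, and only the value 64 is taken from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an IXP400 mbuf. The real driver reaches these
 * fields through the IX_OSAL_MBUF_* macros. */
struct sketch_mbuf {
    size_t mlen;          /* what IX_OSAL_MBUF_MLEN would report */
    size_t pkt_len;       /* what IX_OSAL_MBUF_PKT_LEN would report */
    size_t allocated_len; /* what IX_OSAL_MBUF_ALLOCATED_BUFF_LEN would report */
};

/* Value quoted in the message for IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN. */
#define SKETCH_RXFREE_LENGTH_MIN 64

/* Mirrors the length reset performed before the replenish call. */
void sketch_reset_lengths(struct sketch_mbuf *m)
{
    assert(m != NULL);
    m->mlen = m->pkt_len = m->allocated_len;
}

/* Mirrors the length test: false means the replenish would be rejected. */
bool sketch_replenish_ok(const struct sketch_mbuf *m)
{
    assert(m != NULL);
    return m->mlen >= SKETCH_RXFREE_LENGTH_MIN;
}
```

With `allocated_len` equal to 0, the reset overwrites whatever length the frame had with 0, so the check fails no matter what size the received frame was.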
>>
>> The packets that trigger this have the following flags set:
>> 0x8090
>>   IX_ETHACC_NE_LINKMASK=0x01 - IEEE802.3 - Ethernet (Rx) / IEEE802.3 - Ethernet (Tx)
>>   IX_ETHACC_NE_FILTERMASK is set
>>   IX_ETHACC_NE_NEWSRCMASK is set
>> or
>> 0x8080
>>   IX_ETHACC_NE_LINKMASK=0x00 - IEEE802.3 - 8802 (Rx) / IEEE802.3 - 8802 (Tx)
>>   IX_ETHACC_NE_FILTERMASK is set
>>   IX_ETHACC_NE_NEWSRCMASK is set
>> It is the IX_ETHACC_NE_FILTERMASK that makes the driver try to replenish
>> the packet. According to the header IX_ETHACC_NE_FILTERMASK means:
>>  * @brief This bit indicates whether a frame has been filtered by the
>>  * Rx service.
>>  *
>>  * This mask applies to @a IX_ETHACC_NE_FLAGS.
>>  * Certain frames, which should normally be fully filtered by the NPE due to
>>  * the destination MAC address being on the same segment as the Rx port are
>>  * still forwarded to the XScale (although the payload is invalid) in order
>>  * to learn the MAC address of the transmitting station, if this is unknown.
>>  * Normally EthAcc will filter and recycle these frames internally and no
>>  * frames with the FILTER bit set will be received by the client.
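
For reference, the two flag words quoted above can be decoded with a few masks. This is a sketch, not the header's definitions: the bit positions below are assumptions chosen to be consistent with the 0x8090 and 0x8080 examples in this message, and the `SKETCH_*` names are hypothetical. Check them against the `IX_ETHACC_NE_*` definitions in your copy of the header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, inferred from the 0x8090 / 0x8080 examples. */
#define SKETCH_NE_LINKMASK   (0x3u << 4)
#define SKETCH_NE_FILTERMASK (0x1u << 7)
#define SKETCH_NE_NEWSRCMASK (0x1u << 15)

struct sketch_flags {
    unsigned link;  /* link type field, 0x00 or 0x01 in the examples */
    bool filtered;  /* frame was marked as filtered by the Rx service */
    bool new_src;   /* frame carries a newly seen source address */
};

/* Split a 16-bit flag word into the three fields discussed in the message. */
struct sketch_flags sketch_decode_flags(uint16_t flags)
{
    struct sketch_flags out;

    /* The sketch relies on the masks not overlapping. */
    assert((SKETCH_NE_LINKMASK & SKETCH_NE_FILTERMASK) == 0);

    out.link = (flags & SKETCH_NE_LINKMASK) >> 4;
    out.filtered = (flags & SKETCH_NE_FILTERMASK) != 0;
    out.new_src = (flags & SKETCH_NE_NEWSRCMASK) != 0;
    return out;
}
```

Under these assumed masks, both 0x8090 and 0x8080 decode as filtered frames with a new source address, differing only in the link field.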
>>
>> I suspect this is occurring because we are in promiscuous mode. The
>> reason it doesn't kill your ethernet immediately is because each message
>> is a leak of one buffer and there are RX_MBUF_POOL_SIZE or 80 of them
>> available. Once you run out, you cannot receive but you can send.
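
The delayed failure can be modelled with a simple counter. This is a toy model, not driver code: the `sketch_*` names are hypothetical, and the only figure taken from the message is the pool size of 80.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Value quoted in the message for RX_MBUF_POOL_SIZE. */
#define SKETCH_RX_POOL_SIZE 80

/* Toy model of the receive pool: only the count of free buffers matters. */
struct sketch_rx_pool {
    int free_buffers;
};

void sketch_pool_init(struct sketch_rx_pool *p)
{
    assert(p != NULL);
    p->free_buffers = SKETCH_RX_POOL_SIZE;
}

/* Receive one frame. A buffer leaves the pool and only comes back when the
 * replenish succeeds. Returns false once the pool is empty, which models
 * the port no longer receiving. */
bool sketch_receive(struct sketch_rx_pool *p, bool replenish_ok)
{
    assert(p != NULL);
    if (p->free_buffers == 0)
        return false;
    p->free_buffers--;
    if (replenish_ok)
        p->free_buffers++;
    return true;
}

/* Count how many failed replenishes are absorbed before receive stalls. */
int sketch_frames_until_stall(void)
{
    struct sketch_rx_pool pool;
    int frames = 0;

    sketch_pool_init(&pool);
    while (sketch_receive(&pool, false))
        frames++;
    return frames;
}
```

In this model the pool absorbs exactly as many failed replenishes as it has buffers, which matches the observation that the port works for a while and then stops receiving while transmit is unaffected.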
>>
>> The bug occurs because the ethernet driver is creating the receive
>> buffer pool with a size of 0. Sadly the IX_OSAL_MBUF_POOL_INIT function
>> does not check the size passed in even though the replenish code has a
>> minimum size restriction. I changed the size passed in to
>> IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN or 64 and now this situation
>> seems to be handled properly.
>> -Ack
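
One way to catch this class of mistake earlier would be to validate the buffer size when the pool is configured, rather than when a buffer is replenished. The sketch below is hypothetical: `struct sketch_pool_cfg` and `sketch_pool_cfg_init` do not exist in the IXP400 software, and the real `IX_OSAL_MBUF_POOL_INIT` performs no such check according to the message above.

```c
#include <assert.h>
#include <stddef.h>

/* Value quoted in the message for IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN. */
#define SKETCH_RXFREE_LENGTH_MIN 64

/* Hypothetical pool description: number of buffers and size of each. */
struct sketch_pool_cfg {
    size_t count;
    size_t buf_size;
};

/* Fill in a pool description, rejecting sizes that would later fail the
 * replenish length test. Returns 0 on success and -1 on a bad argument. */
int sketch_pool_cfg_init(struct sketch_pool_cfg *cfg, size_t count,
                         size_t buf_size)
{
    assert(cfg != NULL);
    if (count == 0 || buf_size < SKETCH_RXFREE_LENGTH_MIN)
        return -1;
    cfg->count = count;
    cfg->buf_size = buf_size;
    return 0;
}
```

With a check like this, the original call with a size of 0 would fail at probe time instead of leaking buffers one at a time under traffic.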
>>
>>
>> David Acker wrote:
>>> The port is not part of a bridge. It is in promiscuous mode and is
>>> receiving and sending a relatively large amount of traffic. I have a
>>> userspace program that has a raw socket bound to each port. I don't
>>> have any netfilter rules in place.
>>>
>>> I have done some more debugging and found some information. The failure
>>> is due to
>>> ixp400_xscale_sw/src/ethAcc/IxEthAccDataPlane.c::ixEthAccPortRxFreeReplenish(...)
>>> thinking that the length is too small. See the following test:
>>> if (IX_OSAL_MBUF_MLEN(buffer) < IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN)
>>>
>>> The length reported by IX_OSAL_MBUF_MLEN(buffer) is 0 and
>>> IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN = 64.
>>>
>>> ixEthRxFrameProcess calls ixEthAccPortRxFreeReplenish and before the
>>> call it tries to reset the length with the following line of code:
>>> IX_OSAL_MBUF_MLEN(mbufPtr) = IX_OSAL_MBUF_PKT_LEN(mbufPtr) =
>>> IX_OSAL_MBUF_ALLOCATED_BUFF_LEN(mbufPtr);
>>>
>>> Before this call to adjust the lengths I have seen the lengths reported
>>> by IX_OSAL_MBUF_MLEN(mbufPtr) and IX_OSAL_MBUF_PKT_LEN(mbufPtr) be
>>> various sizes although sometimes they are below 64 (I have seen as low
>>> as 60). IX_OSAL_MBUF_ALLOCATED_BUFF_LEN(mbufPtr) is reporting 0 which
>>> leads to IX_OSAL_MBUF_MLEN(mbufPtr) becoming 0 and the length test
>>> failing.
>>>
>>> I am still looking into when the allocated buffer length got corrupted.
>>> It could be a locking issue and/or a race condition that only happens
>>> at high speeds. With any luck I will know more by the end of the day.
>>> -Ack
>>>
>>> Dave G wrote:
>>>> Ack,
>>>>
>>>>
>>>> Is the affected port part of a bridge group? Do you have any netfilter
>>>> rules in place either using bridge nf or not?
>>>>
>>>>
>>>> -Dave
>>>>
>>>>
>>>>> Hello folks,
>>>>> We are running the .06 development kit on our gw2348-4 board. If we
>>>>> run it long enough with even small amounts of ethernet traffic we
>>>>> eventually get the following error:
>>>>>
>>>>> [fatal] ixEthRxFrameProcess: Failed to replenish with filtered frame
>>>>> on port 0
>>>>>
>>>>> After this error the port usually does not receive at all but can
>>>>> send. For example, when I ping an ethernet client that has ethereal
>>>>> on, I can see the ARP requests, and I can see the client send the ARP
>>>>> response, but the board's arp tables do not show ever getting the
>>>>> response. The port stays in this state through unplug/replug of the
>>>>> cable. Sometimes it stays in this state through a reboot command.
>>>>> It always clears up on a full power cycle. The other ethernet port
>>>>> will work when one port is in this state.
>>>>>
>>>>> Does anyone know what causes this error and how to fix it?
>>>>>
>>>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: avila-unsubscribe at lists.unixstudios.net
>>> For additional commands, e-mail: avila-help at lists.unixstudios.net
>>>
>>
>>
>
>
> --------------------------------------------------------------------------------
>
>
>> --- snapgear/modules/ixp425/net-2.0/ixp400_eth.c.orig 2006-10-10 16:54:17.000000000 -0400
>> +++ snapgear/modules/ixp425/net-2.0/ixp400_eth.c 2006-10-10 14:07:54.000000000 -0400
>> @@ -3198,7 +3198,7 @@ static int __devinit dev_eth_probe(struc
>>      TRACE;
>>
>>      /* initialize RX pool */
>> -    priv->rx_pool = IX_OSAL_MBUF_POOL_INIT(RX_MBUF_POOL_SIZE, 0,
>> +    priv->rx_pool = IX_OSAL_MBUF_POOL_INIT(RX_MBUF_POOL_SIZE, IX_ETHNPE_ACC_RXFREE_BUFFER_LENGTH_MIN,
>>                           "IXP400 Ethernet driver Rx Pool");
>>      if(priv->rx_pool == NULL)
>>      {
>>
>>
>
>
> --------------------------------------------------------------------------------
>
>