debian-kernel.lists.debian.org


Bug#600487: Invalid GART PTE entry errors during bulk data transfers

Jaap Hoetmer, Sun, 24 Jul 2011 15:27:25 +0000 (UTC)
Hello,

Recently I installed Debian Linux 6 (Squeeze, kernel 2.6.32-5-amd64 #1 
SMP) via netinst on an IBM eServer platform. The system has dual AMD 
Opteron processors.

While transferring lots of data from the original server this server was 
expected to replace, I noticed errors appearing repeatedly every 4 
minutes or so in the ssh sessions:

Message from syslogd@jupiter at Jul 24 07:30:07 ...
kernel:[43618.440106]  Northbridge Error, node 0

Message from syslogd@jupiter at Jul 24 07:30:07 ...
kernel:[43618.440304] Invalid GART PTE entry during table walk.

The errors appeared regularly, and it seemed only during very large data 
transfers across the network. As soon as the file transfers (using 
rsync) were completed, the errors stopped appearing. These messages show 
on all ssh sessions I had open to that server.
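
To check how regularly these messages recur, the kernel lines can be pulled out of a saved log and their timestamps compared. The following is a minimal Python sketch, assuming the lines keep the `kernel:[seconds] message` shape shown above; the function name `parse_kernel_lines` is illustrative and not part of any tool.

```python
import re

# Matches lines such as "kernel:[43618.440106]  Northbridge Error, node 0".
KERNEL_LINE = re.compile(r"^kernel:\[\s*(\d+\.\d+)\]\s*(.*)$")


def parse_kernel_lines(text):
    """Return (uptime_seconds, message) pairs for every kernel line in text."""
    events = []
    for line in text.splitlines():
        match = KERNEL_LINE.match(line.strip())
        if match:
            events.append((float(match.group(1)), match.group(2).strip()))
    return events


# The two messages quoted above, used here as sample input.
sample = """\
Message from syslogd@jupiter at Jul 24 07:30:07 ...
kernel:[43618.440106]  Northbridge Error, node 0

Message from syslogd@jupiter at Jul 24 07:30:07 ...
kernel:[43618.440304] Invalid GART PTE entry during table walk.
"""

for uptime, message in parse_kernel_lines(sample):
    print(f"{uptime:.6f}  {message}")
```

Subtracting consecutive timestamps from a longer log would show whether the interval really is close to four minutes.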

After some searching, I found a Linux kernel patch from Borislav Petkov 
at AMD where the exact error message was listed.
I also searched the Debian lists and found this bug report (600487) but 
that seemed related to X which I don't use on this particular machine, 
plus, the symptoms I see are triggered by data transfers via the network 
interface.

The following document from AMD however gave me the best information, 
but doesn't yet explain why the errors appear in the ssh sessions, much 
less why this appears during bulk data transfers. AMD states these 
messages should be suppressed.

http://support.amd.com/us/Processor_TechDocs/...

On page 333 I read:

12.10.1 GART Table Walk Error Reporting

This error is typically caused by a software graphics driver that
improperly reserves or allocates aperture pages in the GART, resulting
in benign visual artifacts which are often undetected on other
platforms. Setting MC4_CTL[10] allows software developers to debug this
error; the resulting benign machine check errors can, however, confuse
an end user. For this reason, AMD recommends that the BIOS developers
disable this function by setting bit 10 of MC4_CTL_MASK register (MSR
C001_0048h) to a value of 1. This bit must be set before MC4_CTL[10] bit
is set. AMD also recommends adding a setup option to the BIOS setup
menu. The following should be displayed in the setup option:

Gart Table Walk Error MC reporting: Disabled/Enabled.

The default setting is disabled. The device driver developer may enable
this function for implementation and testing purposes. Also, a help
message should be added with this setup option. An example of the help
message is:

This option should remain disabled for normal operation.
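
The bit described in the AMD text can be inspected from a running system. The following is a rough Python sketch, assuming the Linux `msr` driver is loaded (`modprobe msr`) and the script runs as root; the helper names `read_msr` and `gart_walk_reporting_masked` are made up for illustration, and the values passed in at the end are hypothetical, not readings from the machine discussed here.

```python
import os
import struct

MC4_CTL_MASK = 0xC0010048  # MSR C001_0048h, as quoted from the AMD document
GART_WALK_BIT = 10         # bit 10 masks GART table walk error reporting


def gart_walk_reporting_masked(msr_value):
    """Return True if bit 10 of an MC4_CTL_MASK value is set (reporting disabled)."""
    return bool((msr_value >> GART_WALK_BIT) & 1)


def read_msr(address, cpu=0):
    """Read one 64-bit MSR via /dev/cpu/<cpu>/msr.

    Assumption: the msr kernel module is loaded and the caller is root.
    """
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr driver uses the file offset as the register number.
        return struct.unpack("<Q", os.pread(fd, 8, address))[0]
    finally:
        os.close(fd)


# Hypothetical register values, only to show the bit test itself.
print(gart_walk_reporting_masked(0x400))  # bit 10 set
print(gart_walk_reporting_masked(0x000))  # bit 10 clear
```

On real hardware, `gart_walk_reporting_masked(read_msr(MC4_CTL_MASK))` would report whether the BIOS followed the recommendation on CPU 0; the register may differ per node, so other CPUs would need checking too.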

Ben Hutchings, Sun, 24 Jul 2011 19:09:20 +0000 (UTC)
On Sun, 2011-07-24 at 17:19 +0200, Jaap Hoetmer wrote:
> Hello, 
> 
> Recently I installed Debian Linux 6 (Squeeze, kernel 2.6.32-5-amd64 #1
> SMP) via netinst on an IBM eServer platform. The system has dual AMD
> Opteron processors. 
> 
> While transferring lots of data from the original server this server
> was expected to replace, I noticed errors appearing repeatedly every 4
> minutes or so in the ssh sessions: 
> 
> Message from syslogd@jupiter at Jul 24 07:30:07 ... 
> kernel:[43618.440106]  Northbridge Error, node 0 
> 
> Message from syslogd@jupiter at Jul 24 07:30:07 ... 
> kernel:[43618.440304] Invalid GART PTE entry during table walk.
> [...]

I think this should be fixed in 2.6.32-35.

Which package version do you have installed?

Ben.

Jaap Hoetmer, Sun, 24 Jul 2011 22:27:18 +0000 (UTC)
Thanks, Ben.
I am not too familiar with non-stock kernels. I did have a look at all 
the kernel.org release notes, but could not find the specific error in 
any of them. Anyway, as my note mentioned, I suspected it to be needing 
a later kernel and it didn't seem to have any negative impact on the 
running of the machine.
Where can I find the release notes of the 2.6.32-35 kernel?

Thanks, regards, Jaap

On 24.07.2011 19:00, Ben Hutchings wrote:
> [...]
> I think this should be fixed in 2.6.32-35.
>
> Which package version do you have installed?
>
> Ben.
>

Ben Hutchings, Mon, 25 Jul 2011 09:24:24 +0000 (UTC)
On Mon, Jul 25, 2011 at 12:23:13AM +0200, Jaap Hoetmer wrote:
> Thanks, Ben.
> I am not too familiar with non-stock kernels.

I am referring to Debian kernel package version 2.6.32-35, which
is the current version in stable (Debian 6.0.2).  The command
'dpkg -s linux-image-2.6.32-5-amd64' will show the current package
version (among other things).
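
Since `dpkg -s` prints many fields, a short script can pick out just the version. This is a minimal Python sketch, assuming the usual `Field: value` layout of dpkg status output; the function names are illustrative, and the sample text is a hand-written excerpt rather than output captured from the machine in this thread.

```python
import subprocess


def parse_dpkg_status(text):
    """Turn the 'Field: value' lines of dpkg -s output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Continuation lines (e.g. long descriptions) start with whitespace.
        if line and not line[0].isspace() and ":" in line:
            name, _, value = line.partition(":")
            fields[name.strip()] = value.strip()
    return fields


def installed_version(package):
    """Run 'dpkg -s <package>' and return its Version field, or None."""
    result = subprocess.run(
        ["dpkg", "-s", package], capture_output=True, text=True, check=True
    )
    return parse_dpkg_status(result.stdout).get("Version")


# Illustrative excerpt only; real output has more fields.
sample = """\
Package: linux-image-2.6.32-5-amd64
Status: install ok installed
Version: 2.6.32-35
"""
print(parse_dpkg_status(sample)["Version"])
```

On a Debian system, `installed_version("linux-image-2.6.32-5-amd64")` would return the version of the package actually installed.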

[...]
> Where can I find the release notes of the 2.6.32-35 kernel?

In the Debian changelog.

Ben.

Bastian Blank, Mon, 25 Jul 2011 09:38:13 +0000 (UTC)
On Mon, Jul 25, 2011 at 10:21:31AM +0100, Ben Hutchings wrote:
> On Mon, Jul 25, 2011 at 12:23:13AM +0200, Jaap Hoetmer wrote:
> > Thanks, Ben.
> > I am not too familiar with non-stock kernels.
> I am referring to Debian kernel package version 2.6.32-35, which
> is the current version in stable (Debian 6.0.2).  The command
> 'dpkg -s linux-image-2.6.32-5-amd64' will show the current package
> version (among other things).

And /proc/version contains information about the running kernel.
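
The string in /proc/version can also be read programmatically. The sketch below is a small Python example, assuming the Debian kernel embeds the package version as `(Debian <version>)` in that string; the function names are illustrative, and the sample line is a shortened, hypothetical example rather than output from the system discussed here.

```python
import re


def kernel_release(proc_version):
    """Return the release string that follows 'Linux version', or None."""
    match = re.match(r"Linux version (\S+)", proc_version)
    return match.group(1) if match else None


def debian_package_version(proc_version):
    """Return the first '(Debian X)' package version found, or None."""
    match = re.search(r"\(Debian ([^)\s]+)\)", proc_version)
    return match.group(1) if match else None


# Shortened, hypothetical sample; a real line carries more detail.
sample = "Linux version 2.6.32-5-amd64 (Debian 2.6.32-35) #1 SMP"

print(kernel_release(sample))
print(debian_package_version(sample))
```

To inspect the running kernel, pass in the contents of the file, for example `open("/proc/version").read()`. Unlike `dpkg -s`, this describes the kernel that is currently booted, which can differ from the installed package until the next reboot.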

Bastian
-- 
We have found all life forms in the galaxy are capable of superior
development.
		-- Kirk, "The Gamesters of Triskelion", stardate 3211.7