AMD Alleges Intel Compilers Create Slower AMD Code
edxwelch writes "In AMD's recent anti-trust lawsuit, AMD have examined the Intel compiler and found that it deliberately runs code slower when it detects that the processor is an AMD.
"To achieve this, Intel designed the compiler to compile code
along several alternate code paths. ... By design, the
code paths were not created equally. If the program detects a "Genuine Intel" microprocessor,
it executes a fully optimized code path and operates with the maximum efficiency. However,
if the program detects an "Authentic AMD" microprocessor, it executes a different code path
that will degrade the program's performance or cause it to crash.""
How can this be done? (Score:4, Interesting)
Wow (Score:2, Interesting)
If true - and that's a big 'if' - I know a lot of people who will soon stop using Intel compilers. This could lead to some significant changes across a large portion of the gaming industry, for starters.
--Ryv
So.. wait (Score:1, Interesting)
If the latter... wow.
Between this and exactly how brazen it looks like their contract abuse was, you'd think Intel could have learned to be less baldfaced about this. Even Microsoft hasn't been this unsubtle since the 80s.
Bastards. (Score:2, Interesting)
I'm glad AMD is pursuing this action against Intel just because I like rooting for underdogs, but this lends them the moral high ground they might have been seen to be lacking by some in the tech media.
Where is all this going (Score:4, Interesting)
It's true--and they know about it (Score:5, Interesting)
On any non-Intel processor, it specifically included an alternate code path for "memcpy" that actually used "rep movsb" to copy one byte at a time, instead of (for example) "rep movsd" to copy a doubleword at a time (or MMX instructions to copy quadwords). This was probably the most brain-dead memcpy I'd ever seen, and was around 4X slower than even a typical naive assembly memcpy:
push ecx
shr ecx, 2
rep movsd
pop ecx
and ecx, 3
rep movsb
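For readers less familiar with x86 assembly, the naive routine above can be sketched in C roughly as follows. This is only an illustration: the function name `naive_memcpy` is made up here, and the sketch is not Intel's or the poster's actual code. It copies whole 32-bit words first (the "rep movsd" step) and then the 0 to 3 leftover bytes (the "rep movsb" step).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name. Mirrors the assembly above: copy n/4 doublewords,
   then the remaining n%4 bytes one at a time. */
void *naive_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t words = n >> 2;  /* shr ecx, 2 */
    size_t tail = n & 3;    /* and ecx, 3 */

    while (words--) {       /* rep movsd */
        uint32_t w;
        /* Fixed-size memcpy is the portable way to do an unaligned
           32-bit load/store; compilers typically emit a single move. */
        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
        s += 4;
        d += 4;
    }
    while (tail--)          /* rep movsb */
        *d++ = *s++;
    return dst;
}
```

A byte-only version would run the second loop n times instead of at most three, which is where the slowdown described above would come from.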
They responded with completely ridiculous answers, such as:
"Our 8.0 memcpy was indeed optimized for a Pentium(r)4 Processor, when we reworked this routine we used the simplest, most robust, and straightforward implementation for older processors so that we didn't need the extra code to check for alignment, length, overlap, and other conditions."
BS. I went and added the following line to the beginning of my source code:
extern "C" int __intel_cpu_indicator;
then I added:
__intel_cpu_indicator = -512;
to the "main" function.
This forced Intel C++ to use the "Pentium 4" memcpy regardless of which processor is in the machine. I tested their special "Pentium 4" memcpy thoroughly in all kinds of situations, and it worked perfectly fine on an AMD Athlon and a Pentium III. I pointed this out to them.
I received the following response:
"The fast memcpy is over 2000 lines of hand coded assembly, with lots of special cases where different code paths are chosen based on relative alignment of the source and destination.
I answered "No," saying that I needed support for AMD processors as well. I also gave them a copy of my own memcpy routine that was 50% faster than theirs--and just used MMX. They closed the support issue and did nothing to resolve it.
I switched back to Visual C++.
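The override described in the comment above works because the runtime picks its memcpy from a global CPU indicator. A minimal sketch of that general pattern, assuming a made-up global `cpu_indicator` and invented values `CPU_GENERIC` and `CPU_FAST` (the real encoding of `__intel_cpu_indicator` is not given in this thread, and none of these names come from Intel's library), might look like this:

```c
#include <stddef.h>
#include <string.h>

/* Invented encoding for illustration only; not Intel's actual values. */
enum { CPU_GENERIC = 0, CPU_FAST = 1 };

/* Hypothetical stand-in for a runtime library's CPU-type global. */
int cpu_indicator = CPU_GENERIC;

typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Slow path: one byte per iteration, in the spirit of "rep movsb". */
void *copy_bytewise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Fast path: the platform memcpy stands in for a hand-tuned routine. */
void *copy_fast(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* The routine is chosen from the global, so overwriting the global
   (as the poster did in main) changes which path runs. */
copy_fn select_copy(void)
{
    return cpu_indicator == CPU_FAST ? copy_fast : copy_bytewise;
}
```

Setting `cpu_indicator = CPU_FAST;` before the first copy would force the fast path on any processor. That mirrors what the poster reports doing, though the actual mechanism inside Intel's runtime may differ.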
Re:Simply ludicrous (Score:5, Interesting)
Re:How can this be done? (Score:5, Interesting)
The company I work for submitted a few reports on this a few months ago, as I am sure did many others. I am very pleased to see them following up on it.
What's surprising about this? (Score:2, Interesting)
Is this a surprise? Does AMD have a case? (Score:3, Interesting)
In M.A. Mortenson Co., Inc. v. Timberline Software Corp., the courts held that if you accept the license, it's not their fault, even if they knowingly produce a faulty product.
Is it dirty pool - sure is. Is it illegal? That remains to be seen. AMD most certainly has a firm ground to stand on when it comes to antitrust and Intel.
Re:Wouldn't We Notice It? (Score:4, Interesting)
However, I doubt AMD would claim this if it weren't true. They still have a business to run, unlike SCO, and don't want to damage their reputation.
I suspect that the "crash" part is more conjecture than fact, since the unoptimized code paths might be assumed to be lower quality in many ways. Or perhaps AMD found a single instance of a crash that occurs in one of the unoptimized code paths and even if Intel didn't mean for it to be there, they're still on the hook for that path in the first place.
I have a feeling AMD has been working on this lawsuit for several years, so it will be interesting when they do finally submit the evidence to the court.
omg! (Score:2, Interesting)
It makes a great deal of sense to me.
You know the ins and outs of your own product. You can optimize code for your product, so why should you have to optimize code for someone else's product? Maybe that company should be writing their own compilers?
Are people really that lazy? Why do the companies that _MAKE_ money get fucked over? Because the little companies that only went into business to get a CHUNK of that money are mad they're not getting enough... Fuck 'em.
It's like putting a fence around your house. You protect your assets.
More Likely (Score:2, Interesting)
as someone who worked at a compiler company (Score:5, Interesting)
The engineers get the specs for the next version of the compiler. They also get a slew of bug reports from the last version. They have a short amount of time to implement the new specs and fix the bugs.
The bug reports will be something like, "On AMD processors, when doing a memcpy with optimization xyz turned on, the processor mispredicts half the time. This makes it very slow."
The engineer in that case, turns the optimization off for that generated code, thereby 'fixing' the bug (but not really). It happens all the time.
It's not a nefarious plot, it's the same time crunch issue that every software engineer has to deal with.
Re:It is semi true (Score:1, Interesting)
It either compiles for a P4 or for x86. Since AMD falls into the category of "other" it uses a different memcpy which takes a lot longer. (roughly 2x - 4x as long).
This enables the "other" code to run in basically any 286+ x86 processor but makes P4 code fast.
It is not anti-competitive so much as not suited for the task of compiling code that isn't targeted at the P4.
Re:How can this be done? (Score:4, Interesting)
My company writes some code that depends on SSE2 instructions. We bought one AMD (64-bit) machine, but the code was slower on an Athlon64 3200 than on a P4 3000 (under WinXP, not on a 64-bit system). We depend heavily on every clock cycle, so I would really like to know how to trick Intel's compiler into believing that an AMD chip is a P4.
Intel Chip = No Sale (Score:2, Interesting)
Re:The Limit of Lawsuits (Score:2, Interesting)
Personally, I think this is a bit of a grey area. Obviously, it seems wrong that Intel should be crippling software, but at the same time, they aren't making anyone use that compiler in the way they are making people not sell AMD products (maybe I'm wrong, I didn't read that enormous legal document). Ultimately, this whole thing is secondary to the monopolistic discount allegations anyway, so it would be nothing more than icing if it's true. It does make for a nice "they're big meanies!" finger-pointing fest, though, huh?
Re:Simply ludicrous (Score:0, Interesting)
Re:It's true--and they know about it (Score:4, Interesting)
Re:Relevant Section (Score:2, Interesting)
This may be bullshit... (Score:2, Interesting)
Re:Instruction timing??? (Score:3, Interesting)
Though in this case the solution is simple. Don't buy Intel, don't use ICC. Usually on my P4 I can trick GCC [-fno-regmove comes up] into getting similar performance to ICC v8.
Even then, ICC has good schedulers but performs fewer higher level optimizations. So GCC is usually better in that respect.
Tom
Re:Another EXCELLENT reason to use open source.. (Score:3, Interesting)
Which, in this case, is really crappy performance.
There's really only one reason to use Intel's compiler -- for performance. It's well known that Intel's compiler generates code that vastly outperforms everything else for the same platform (namely Microsoft Visual C++ 6/7 and gcc; everyone else, such as Watcom and Borland, has long since been relegated to "also ran" status).
We're talking about a rather significant performance difference too -- 20% or more typically. Even more if you compare to gcc (x86 may be one of the most optimized target platforms for gcc/g++, but it's still got a long, long way to go comparatively; even to MSVC).
Intel also has one of the better C++ compliance records as I recall, although MSVC++ 7.0 pretty much eliminated that gap (gcc was far better than MSVC 6, they're roughly equal now), so that's another reason to use them.
But really it's all about the performance. If you have a product targeted for x86 and performance is one of your top criteria, then you'd be foolish to not consider using Intel's compiler for your builds. The reason not everyone does so is because it's expensive and the UI isn't as good as MSVC, particularly for debugging.
All of that said, the allegations are still damning. Yes, Intel has the right to tune the compiler for their CPUs. But if the alternate paths are coded stupidly or intentionally bad-case (worst case is not required) then they could be found to be engaging in anti-competitive behavior. Even if those code paths affect Intel processors as well -- it just has to affect only old Intel processors (hello upgrade!).
Additionally, you might be able to make a (weak) argument that using the "heavily optimized" path only on your own CPUs after having been informed that it works just fine on other CPUs is also anti-competitive. As stated, it's a very weak argument though, since if you do so you'd then have to test any changes you made in that particular heavily optimized code path on the other systems -- which you don't have as much knowledge on.
Of course, the question is why hasn't AMD come out with their own compiler? They should be dedicating resources in this direction instead of (or as well as) litigating Intel. If they don't want to build their own compiler, that's fine -- simply dedicate some time to helping gcc improve its low level compilation performance. I'd be surprised if the gcc x86 team wouldn't welcome any real support of that nature (and by that I mean an employee assigned to actually writing code, or at least a very direct line to AMD engineers that could help an existing gcc coder).
Re:It's true--and they know about it (Score:4, Interesting)
Re:Regulators Raid Intel Offices (Score:2, Interesting)
Re:Oh brother (Score:4, Interesting)
That's because you're trivializing the issue. Intel has a chip monopoly, and their compiler has a huge influence, even if it isn't used in production by the majority of their customers. Purposely reducing the efficiency for AMD chips is a great example of anti-competitive behavior, which is what monopoly laws are all about.
Work on GCC! (Score:5, Interesting)
Come on, AMD... If you do need to do your own compiler work, optimize GCC! The whole idea is to make code run fast on your chips, right? And think of the tremendous goodwill you'd build up, especially around here.
Re:Simply ludicrous (Score:5, Interesting)
http://validator.w3.org/check?uri=http%3A%2F%2Fww
Re:The Limit of Lawsuits (Score:1, Interesting)
Re:Another EXCELLENT reason to use open source.. (Score:3, Interesting)
Uhm, excuse me, but isn't the compiler's assembly output what is actually running?
And therefore you can inspect it using a debugger? Or by comparison with the output of an uncompromised compiler that does nearly but not exactly the same compilation methods used by the suspected one?
I think Thompson's point was that while inspecting the source code of the compiler will not reveal if the compiler is compromised, if you have the compiler output, you can still detect it.
This means you can certainly compromise any software if you don't have access to the source code, and if you have the source code, it could be harder. But if you have the output, you can certainly detect the compromise.
One way would be to run the same program through a compiler that is made by someone else which presumably does not use the same method of compromise and compare the output. It would be hard, but no harder I assume than detecting whether copyrighted code is included in some other software.
In fact, this is what Thompson actually said:
"You can't trust code that you did not totally create yourself. (Especially code from companies that employ people like me.) No amount of SOURCE-LEVEL [Emphasis added. RSH] verification or scrutiny will protect you from using untrusted code."
He did also say that the lower-level this sort of thing is done on, the harder it is to detect - it would be nearly impossible on the micro-code level. Which seems to support AMD's contention that the Intel modifications could be sabotage, not just conservatism in compiling for non-Intel processors.
In general, of course, while Thompson's point may be valid, it mostly applies to companies or hackers who may have a motivation - and more importantly, a reputation - to do something like this. It would be pointless for someone like GNU to do it. It would get out and it would damage their reputation.
While Thompson said no code other than that written by yourself can be trusted, I hardly think he requires everyone in the world to write their own assemblers, compilers, operating systems and applications (and design their own CPUs to avoid micro-code tampering). Given that, I'd say that open source is still far more likely to be trustworthy than closed-source.
Which renders the entire discussion here moot.
Re:It is semi true (Score:2, Interesting)
This obviously has nothing to do with the advantages of the processor. The only possible answer is that Intel is deliberately generating poor code for AMD's processors, in order to hamper their competitor. This is inexcusable.
Re:Regulators Raid Intel Offices (Score:3, Interesting)
AMD will probably be sued into oblivion
Simple Solution: WRITE YOUR OWN COMPILER!!! (Score:3, Interesting)
It seems to me that the obvious long-term solution for AMD is to write their own compiler.
And I've often thought the same of Novell - I always believed that one of the primary reasons NetWare foundered was because Novell never wrote their own compiler for the operating system. It was damned near impossible to write an NLM in the old days - you had to get a copy of Dr Watcom, and then do a bunch of undocumented wizardry just to get it to produce a simple "Hello World" output.
Anyway, for those of you computer establishments that lack your own in-house compiler, there's this cell phone company, called Motorola, which has pretty much ditched their chip fab subdivision, but which retains this little subsidiary called "Metrowerks", a subsidiary which doesn't seem to integrate very well with their forward-looking core strategy of providing the means to share Paris Hilton pr0n over hand-held cellular devices...
Re:The Limit of Lawsuits (Score:3, Interesting)
(rolls eyes) You again.
Frankly I am not into the compiler world (I'm no C/Fortran programmer), so I didn't expect that programs compiled with the Intel compiler would even try to work on an AMD CPU.
That would be a perfectly acceptable answer, and the one that AMD would like. However, the Intel compiler is not just producing highly optimized code and leaving it at that. Highly optimized code would work fine on an AMD CPU, partly because AMD has a technology cross-licensing contract with Intel. (Which means that Intel could produce AMD64 CPUs if they wanted!)
The core of the issue is that the code generated by the Intel compiler uses the slowest code path available if the CPU is an AMD. That's a potential Anti-trust violation, and smacks of desperation on Intel's part. I've always been overall happy with Intel's handling of their monopoly, but Moore is no longer at the helm and I fear that Intel may be slipping.
Re:The Limit of Lawsuits (Score:5, Interesting)
Re:Never (Score:3, Interesting)
IDT Winchip, Cyrix 486/133 and other oddball cpus?
My point is that I don't believe Intel intentionally broke the compiler for AMD, I think they took the approach of supporting their hardware fully, while using the _most compatible_ paths for everyone else. If that is what they did then AMD's argument is worthless. That doesn't change the end result, but it does insulate Intel from any legal wrongdoing.
-nB
Re:The Limit of Lawsuits (Score:3, Interesting)
There isn't anything really poorly designed about the SPARC, which isn't to say it doesn't have some things nobody would put in a new CPU. For example it is one of only 2 commercial RISC CPUs with register windows (the other being the ill-fated AMD 29k), so I doubt anyone would do that again. It has (optional) branch delay slots, which were a win for a few years, but on OOO CPUs you can get the same gain without the pain. It also has a MULSTEP instruction, which is pretty much a waste when you have the transistor budget for a real multiplier.
All of those can be forgiven as either actually a good idea in the late 80s, or at least not known to be a bad idea. Even the much maligned register stack was a pretty effective cache with way fewer transistors for a while there.
SPARC hasn't suffered because it was a crap design; it has suffered because it doesn't have the same volumes the x86 does, so Intel putting $0.02 per CPU back into R&D gets it a bigger R&D budget than Sun pouring $1000 per CPU back into R&D. Disclaimer: I made those numbers up. Totally invented.
Re:Compiler + host platform + target platform comb (Score:3, Interesting)
I resent that. I optimize my inner loops (and the outer loops, and even the startup initialization data is cache aligned...) and develop games and I use MS VC6 for Windows and GCC for Linux/*BSD* exclusively.
What sort of silly person would expect an INTEL compiler to generate decent AMD code anyways? While I didn't expect intentional sabotage, I'm not entirely surprised either. It's not like it's in Intel's best interest to spend millions on creating an optimizing AMD compiler.
Re:Metrowerks sold their x86 compiler assets. (Score:3, Interesting)
Re:Regulators Raid Intel Offices (Score:3, Interesting)
There was a stock split that took the price from $120 to $60 overnight. The drop to and stagnation at the $25 range for the past few years is not the result of the DOJ ruling, but reflects the fundamentals of Microsoft's stagnating business. Indeed, if the DOJ outcome had any effect, it would be to inflate the MSFT stock price because it demonstrated that the DOJ was impotent to stop Microsoft from further abusing its monopoly position.
Link (Score:5, Interesting)
Re:Simply ludicrous (Score:2, Interesting)
I think the statement was, "If you can't optimize it for Intel, then at least cripple it for AMD", or something to that effect.
I'm sure they do the same thing with their compiler and other things.
Re:Never (Score:3, Interesting)
Do you believe that it is pure coincidence that Intel's compiler produces fast code for their processor?
Do read the complaint! (Score:3, Interesting)
Re:You don't seem to understand either (Score:2, Interesting)
The fact is, it is SUFFICIENT to test only for the presence of the SSE/SSE2/SSE3 instruction sets. Anything more, like checking the vendor string ("Genu" "ineI" "ntel"), is absolutely unnecessary. After all, why would Intel care if the code fails on non-Intel CPUs if ICC is only meant for Intel CPUs (which it isn't)? On the other hand, if ICC is designed for other CPUs, then Intel obviously would know the capabilities of said CPUs (after all, Intel IS licensing the SSE/SSE2/SSE3 instruction sets to AMD), in which case they should enable SSE/SSE2/SSE3 for CPUs capable of the instructions. Last I heard, there weren't any problems with AMD's SSE/SSE2/SSE3 implementation. The bottom line is that this boils down to a Catch-22 situation, and Intel should know better than to pull such cheap tricks. If Intel is going to assume that any CPU that implements the i386 instruction set does so without major problems, then why would they not assume the same with SSE/SSE2/SSE3?
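The difference between a feature test and a vendor test can be sketched as follows. This is an illustration of the argument, not ICC's actual dispatch logic. The function and enum names are made up, and the register values are passed in as plain integers so the sketch does not depend on any particular CPUID intrinsic. It relies on two documented facts: CPUID leaf 0 returns the vendor string in EBX, EDX, ECX (in that order), and CPUID leaf 1 reports SSE2 in bit 26 of EDX.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical labels for the two code paths under discussion. */
enum path { PATH_GENERIC, PATH_SSE2 };

/* Rebuild the 12-character vendor string from CPUID leaf 0 registers.
   Bytes are extracted with shifts so the result does not depend on the
   host's endianness. */
void vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    uint32_t regs[3] = { ebx, edx, ecx };
    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xff);
    out[12] = '\0';
}

/* Feature-based dispatch: only asks whether the CPU reports SSE2
   (bit 26 of EDX from CPUID leaf 1). */
enum path pick_by_feature(uint32_t leaf1_edx)
{
    return ((leaf1_edx >> 26) & 1u) ? PATH_SSE2 : PATH_GENERIC;
}

/* Vendor-gated dispatch: additionally requires the vendor string to be
   "GenuineIntel", so an SSE2-capable CPU from another vendor still gets
   the generic path. */
enum path pick_by_vendor(const char *vendor, uint32_t leaf1_edx)
{
    if (strcmp(vendor, "GenuineIntel") != 0)
        return PATH_GENERIC;
    return pick_by_feature(leaf1_edx);
}
```

With an SSE2-capable "AuthenticAMD" CPU, `pick_by_feature` selects the SSE2 path while `pick_by_vendor` falls back to the generic one, which is the behavior the comment above objects to. On GCC or Clang the register values could be obtained with `__get_cpuid` from `<cpuid.h>`.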
Furthermore, if you had actually read some of the user posts (and their links, e.g. http://www.swallowtail.org/naughty-intel.html [swallowtail.org]), you would have realized that ICC deliberately produces code that segfaults when it doesn't find an Intel CPU. According to the article, this is with the -xK flag. That is, it produces ONLY SSE code, with NO fallback to i386 code. Yes, that means this code will fail on the original Pentium. However, in this case, a Pentium will still try to execute the SSE code (and subsequently crap out), while AMD CPUs will not even try (and automatically segfault). This kinda throws the "and then does 'i386' code path, which is less optimal" argument out the window....