days involving letters from Intel employees. If you've been living under a
rock, check out Kyle's OCP and read up. While I'm not going to say, "All this
stuff about bus-locking is a bunch of crap!" or "AMD and Intel would never work
together!" there are some other issues in these letters that make me question
how aware these people are of what's really going on, not only inside Intel but
out in the real world.
The first statement that I have a major problem with is the following by Karl
Andrews:
"Also, faster chips can be sold for higher prices, right? When we test
manufacturing batches, we sort them by maximum reliable speed. If a 333 MHz
chip was capable of running reliably at 350 or 400, don't you think we would be
selling it at that speed, with it's correspondingly higher price? Whatever you
may think of Intel, we aren't stupid."
Intel may not be stupid, but I have to wonder about someone who ridicules a
practice Intel has been blatantly engaging in for several months now. Faster
chips can be sold for higher prices--but if there are more of those faster,
more expensive chips than there are people who want to pay tons of cash for
them, you end up with a glut. Intel has made no secret about its yields
exceeding even its own rosy expectations. Intel announced recently that it was
pushing up its roadmap because things are just so peachy.
In fact, it's common knowledge around the industry that things are going almost
too well. In the past, the fastest chips were expensive because so few cores
in each batch were stable at the higher speeds. Now Intel is cranking
out tons of cores that will hit 450 easily. If Intel stuck strictly to the
"sell the chip at its maximum speed" rule, there would be a huge shortage of
mid-range chips, with accompanying high prices (think supply and demand) for
those chips, because so many of the cores rate above mid-range speeds.
Meanwhile, there would be tons of 450's on the shelf that wouldn't be selling
because all of the people who could have afforded them would have already
bought them. Eventually, then, the law of supply and demand would see the
price of the 450's dropping because of the overstocking. This scenario would
ensure people getting 450's for closer to the cost of the now-unavailable (in
our scenario) 300's, which would be crappy for Intel, because it would
encourage people to wait longer before upgrading. Intel, not being stupid,
realizes it's better to sell 450-capable cores at 300 or 333 than to either let
them rot while people are clamoring for mid-range chips, or sell them as 450's
at 300 prices. This "underclocking" practice has been going on for a while now.
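To put rough numbers on that argument, here is a minimal Python sketch of down-binning. Every figure in it is invented for illustration (I don't have Intel's real yield or sales numbers), and `down_bin` is a hypothetical helper, not anything Intel uses. It assumes a core can be sold at any grade at or below its maximum reliable speed, and it fills demand from the fastest grade down, carrying unsold cores to the next slower grade.

```python
def down_bin(capable, demand):
    """Allocate cores to speed grades, fastest grade first.

    capable: dict of speed (MHz) -> cores whose maximum reliable speed is that speed
    demand:  dict of speed (MHz) -> buyers willing to pay for that grade
    Returns (plan, leftover), where plan maps each grade to the number sold
    and how many of those were cores capable of a faster grade.
    """
    surplus = 0  # unsold cores carried down from faster grades
    plan = {}
    for speed in sorted(capable, reverse=True):
        native = capable[speed]
        wanted = demand.get(speed, 0)
        pool = native + surplus
        sold = min(pool, wanted)
        # Count native cores first; anything beyond that came from faster bins.
        from_faster = max(0, sold - native)
        plan[speed] = {"sold": sold, "down_binned": from_faster}
        surplus = pool - sold
    return plan, surplus


# Hypothetical numbers: yields skew fast, demand skews mid-range.
CAPABLE = {450: 600, 400: 200, 333: 100, 300: 100}
DEMAND = {450: 100, 400: 150, 333: 350, 300: 400}

if __name__ == "__main__":
    plan, leftover = down_bin(CAPABLE, DEMAND)
    for speed, row in plan.items():
        print(speed, row)
    print("unsold cores:", leftover)
```

With these made-up inputs, 250 of the 350 chips sold as 333's and 300 of the 400 sold as 300's are cores that tested faster than the label on the cartridge, and nothing is left on the shelf.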
I'm somewhat curious as to what tests are performed to determine maximum
reliable speed. It must be a real nut-buster, considering that my SL2QG is
rock solid at 400MHz, cranking out the flyby sequence in Unreal for hours on
end with a core that isn't even as warm as my hand. Informal surveys on the net
suggest that 4 out of every 5 300A's can tag 450 with equal ease, and the one I
swapped my QG out for is doing just that. The Celerons have been even more
successful at overclocking than the mid-range P2's because Intel saved money by
putting slower cache on the mid-range P2's, which don't need faster cache to
run at their rated speeds. Tests with the L2 disabled indicate that the vast
majority of those mid-range P2 cores do just fine at 450. The
Celerons are always happy because their caches are made on the same .25 micron
process and die as their cores.
This brings me to the second letter at the OCP, an extensive discussion of
electromigration. I'm not ashamed to admit that I'd never ever heard of the
term electromigration before reading this letter. However, that fact doesn't
keep me from poking holes in the letter's logic. The author states that people
shouldn't overclock because it "speeds up the process EM failure" (I'm assuming
there should be an "of" after process). He also says:
"EM failure is very, very difficult to detect until it actually happens . . . .
you have certain probabilities of EM failure. But that's all they are is
probabilities. So it is nearly impossible to 'test' the chip to find out if it
has an increasing amount of potential for EM failure . . . especially since
every chip is unique and the characteristics of the chip initially are unknown."
Hmm. So what he's saying is that EM is a Very Bad Thing, and that the crappy
part is that it's impossible to tell whether a particular chip is going to fail;
you can only make an educated guess based on the failure rates and times of the
same or similar chips made using the same or similar processes. Fine. But
since all of these chips (the cores of the 300A's and SL2W8's we know and love
as well as the "real" P2-400's and 450's) are coming off the same .25 micron
process and the same wafers, and since EM failure is "nearly impossible" to
test for, how can he say the "real" 450's that Intel sells for gobs of money
are any less susceptible to it than the 300A's and SL2W8's?
Assume for a moment that I have two "real" 450 cores, according to Intel--and
let's not forget that many of those 450 cores are finding their way into
cartridges sold as 300's on up. I slap one into a 450 cartridge, and the other
into an SL2W8 cartridge.
I then crank them both up to 450 on respective BH6's. According to the
statement that overclocking speeds up the process of EM failure, the SL2W8 is
going to fail quicker because it's being overclocked. Does anybody actually
buy that argument?
PCGamer has an article on overclocking (which should be titled "An Objective
Study In Overclocking As Told By Intel") in the January 1999 issue. This
article discusses electromigration, as well. The article states (quoting
without permission): "This [electromigration] is a gradual process, where
increased electrical current running through a given circuit causes its
eventual deterioration..." It goes on to state that electromigration takes
years to do damage, but that the process is accelerated by heat.
I don't think PCGamer's definition is really that accurate, since it implies
that if your processor doesn't use increased current, electromigration doesn't
happen at all. I think what they meant to say was that it occurs regardless,
and that additional current, like heat, accelerates the process. If this is the
case, electromigration during overclocking would occur more quickly only if the
processor were set above its recommended voltage and/or it weren't cooled
properly.
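Neither the letter nor PCGamer gives a formula, but the usual empirical model for electromigration lifetime is Black's equation: MTTF = A * J^(-n) * exp(Ea / (k * T)), where J is current density and T is absolute temperature. The Python sketch below compares two operating points as a ratio, so the unknown constant A cancels out. The exponent n = 2 and the 0.6 eV activation energy are illustrative assumptions, not measured values for any Intel process, and `relative_mttf` is a hypothetical helper name.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def relative_mttf(current_ratio, temp_c, ref_temp_c, n=2.0, ea_ev=0.6):
    """Expected lifetime relative to a reference operating point.

    Uses Black's equation, MTTF = A * J**-n * exp(Ea / (k * T)), and divides
    by the same expression at the reference point so that A cancels.

    current_ratio: current density divided by the reference current density
    temp_c, ref_temp_c: operating and reference temperatures in Celsius
    n, ea_ev: ASSUMED exponent and activation energy (illustrative only)
    """
    temp_k = temp_c + 273.15
    ref_temp_k = ref_temp_c + 273.15
    thermal = math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / temp_k - 1.0 / ref_temp_k))
    return current_ratio ** (-n) * thermal


if __name__ == "__main__":
    # Same current density, same temperature: lifetime is unchanged.
    print(relative_mttf(1.0, 50.0, 50.0))
    # Same current density, but running 20 C hotter than the reference.
    print(relative_mttf(1.0, 70.0, 50.0))
    # Same temperature, 20% more current density than the reference.
    print(relative_mttf(1.2, 50.0, 50.0))
```

Note that clock speed doesn't appear anywhere in the model. It matters only through whatever it does to current density and temperature, which is why two identical cores run at the same speed, voltage, and temperature come out with the same estimate no matter what the cartridge says.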
I firmly believe that in the old days, when overclocking a chip resulted in it
being able to double as a hotplate, you were probably gaining speed at the cost
of processor life. It makes perfect sense that a chip running much hotter
than its rated operating temperature isn't going to last as long. I might even
buy that if you have to add a 0.3V hike to that 300A of yours to make it stable
at 450, you might be killing it more quickly. But given the information in the
EM failure letter and PCGamer's article, I don't buy for a minute that a 300A
at 2.0V, which is cool to the touch at 450, is wearing out any faster than a
"real" P2-450. You might have a statistically higher chance of failure simply
because the 300A has on-die L2 and thus more circuits, but I don't buy that
it'll give out any quicker on a 100MHz bus than on a 66. Besides, who's to say
the P2's off-die L2 cache won't give out due to electromigration?
Given that EM failure is so darn difficult to predict, perhaps one reason the
P2-450 is so expensive is that Intel will give you a new one if EM failure
kills it. But given the fact that, according to an Intel employee, overclocking
leaves no "signature" in the event of EM failure, it would be difficult for
them to deny you the same treatment on your 300A.