Intel Patents On-Chip Cosmic Ray Detectors
holy_calamity writes "Intel has been awarded a patent for building cosmic ray detectors into chips, to guard against soft errors where a high energy particle from space changes a value in a circuit. It's a problem that largely only affects RAM. As component sizes shrink further, "this problem is projected to become a major limiter of computer reliability in the next decade", says the patent. Intel's solution is to build in a detector that responds to cosmic errors by repeating the latest operation, reloading previous instructions, or rolling back to a previous state. You can also read the full patent."
Butterflies (Score:1, Funny)
Re: (Score:2, Funny)
Re: (Score:2)
Re: (Score:1)
Mainframes allegedly already do this (Score:3, Interesting)
Re:Mainframes allegedly already do this (Score:4, Interesting)
Either way, the... 2m by 2m (IIRC) display would detect cosmic rays about once every 2 seconds. This would mean my PC case is perforated by cosmic rays several times each minute. That's not rare.
Re: (Score:2, Interesting)
Re: (Score:3, Interesting)
http://www.allbusiness.com/technology/software-services-applications-programming/6493163-1.html [allbusiness.com]
MIL-STD-1750A radiation-hardened processors would be another choice. This company offers Linux support for what is normally so damned proprietary it's sekret. I don't know their product, but just about anything that allows C to supplant Ada and JOVIAL can't be all bad.
Re: (Score:2)
Re: (Score:2)
The real question is, how much energy deposition
Re: (Score:2)
Re: (Score:3, Informative)
Re: (Score:2)
It seems painfully inefficient to 'redo' stuff that doesn't seem to be wrong just because a cosmic ray was detected. It's not like cosmic rays can be easily blocked, either; you could put the computers under a mountain
Re:Mainframes allegedly already do this (Score:4, Insightful)
1) The likelihood of a cosmic ray is ridiculously small. So small in fact that the cost of rewinding progress when they are detected would be completely unnoticeable.
2) We *do* have the ability to package CPUs such that they are protected from cosmic rays. The problem is that the packages are so large and expensive that no one would buy them given the current probability of soft errors.
So the solution is most definitely NOT to stop shrinking transistors. Even in 10 process technology generations, the mean time to a soft error actually affecting a bit on a CPU is something like 1 million hours. Never mind whether or not that particular soft error is critical.
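To put the 1-million-hour figure in perspective, here is a rough sketch of what it implies for one machine and for a fleet. It assumes soft errors arrive independently at a constant rate (an exponential model, which is my assumption and not something the post states); the fleet size of 10,000 is a made-up example.

```python
import math

MTTF_HOURS = 1_000_000        # mean time to a soft error, as quoted in the post
HOURS_PER_YEAR = 24 * 365


def p_at_least_one(hours: float, mttf: float = MTTF_HOURS) -> float:
    """Probability of one or more soft errors within `hours`,
    assuming a constant error rate (exponential model, an assumption)."""
    return 1.0 - math.exp(-hours / mttf)


def expected_events(machines: int, hours: float, mttf: float = MTTF_HOURS) -> float:
    """Expected number of soft errors across a fleet of identical machines."""
    return machines * hours / mttf


# One CPU running for a year: a bit under a 1% chance of seeing any soft error.
p_year = p_at_least_one(HOURS_PER_YEAR)

# A hypothetical fleet of 10,000 such CPUs: dozens of events per year.
fleet_year = expected_events(10_000, HOURS_PER_YEAR)

print(f"single CPU, one year: {p_year:.4f}")
print(f"10,000 CPUs, one year: {fleet_year:.1f} expected events")
```

Under these assumptions a single desktop rarely sees an event, while a large installation sees them routinely, which is consistent with the posts in this thread about servers and mainframes caring more than home users.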
Re:Mainframes allegedly already do this (Score:4, Informative)
so clearly to a human sized target, the impact ratio is significant.
Re: (Score:2)
we used to detect 1 or 2 hits a week (Score:4, Informative)
More for laughs than anything else, I started logging them and found that a server with 16GB got maybe one or two hits per week. After that I started to take ECC seriously - for professional quality servers.
You probably don't need it for the domestic appliance quality stuff that people run at home - but for real work, get some decent kit.
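As a back-of-envelope check, the logged rate above can be turned into per-GB and time-between-hits figures. The 16GB and one-to-two-hits-per-week numbers are from the post; the function names are just for illustration, and the result says nothing about hardware other than that one server.

```python
HOURS_PER_WEEK = 7 * 24
WEEKS_PER_YEAR = 52


def errors_per_gb_year(hits_per_week: float, ram_gb: float) -> float:
    """Logged ECC events per GB of RAM per year."""
    return hits_per_week * WEEKS_PER_YEAR / ram_gb


def mean_hours_between_hits(hits_per_week: float) -> float:
    """Average gap between logged events, in hours."""
    return HOURS_PER_WEEK / hits_per_week


low = errors_per_gb_year(1, 16)    # one hit per week on 16 GB
high = errors_per_gb_year(2, 16)   # two hits per week on 16 GB

print(f"{low} to {high} events per GB per year")
print(f"{mean_hours_between_hits(2)} to {mean_hours_between_hits(1)} hours between hits")
```

Scaled down to a machine with a few GB, that is still several events a year, so whether it matters depends on what happens when a flipped bit goes uncorrected.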
Difference between a Computer Salesman and ... (Score:1)
Q: How do you tell the difference between a computer salesman and a used car salesman?
A: A used car salesman knows when he is lying.
( My apologies to those computer salesmen who do really understand the technology they sell. Unfortunately there are too many who do not. )
Re:But there really is a memory problem (Score:2, Interesting)
Microsoft's XP crash analysis early in this decade concluded that PCs always left on tended to crash unexpectedly. Dump analysis showed strange values in key OS variables, and cosmic rays (or other bit-blasting particles) were among the likely sources. The conclusion was so clear that Microsoft floated the idea (see URL above) that Vista-generation PCs should use Error-Correcting Code (ECC) mem
Photonic Chips (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
As memory cells become smaller they WILL become sensitive to ionizing radiation.
I have heard this claim before and I have yet to see any kind of credible argument for this. The ionization energy loss of a charged particle penetrating matter is proportional to the distance travelled -- so a smaller memory cell may need less energy to "flip" it, but it will also receive less energy by a passing particle. Thus if the aspect ratio (thickness to length) doesn't change, I see no particular reason why smaller transistors should be "more sensitive" to cosmics. To the contrary: a smaller are
Re: (Score:1)
Re: (Score:1)
In the space sector we have fault-tolerant CPUs that have ALUs and registers in triple redundant configurations.
It is exciting that the large CPU manufacturers are taking this seriously now, this might mean that we can fly COTS CPUs in future space missions (a system I am working on is using a 25 MHz SPARC v7 (you cannot even get the v7 manual anymore), so you can imagine how big a differ
Re: (Score:2)
Most high end machines support ECC RAM if that is what you mean.
No, mainframe CPUs typically run in pairs or triples and are supposed to recover from errors (not just cosmic rays, too).
It is exciting that the large CPU manufacturers are taking this seriously now, this might mean that we can fly COTS CPUs in future space missions (a system I am working on is using a 25 MHz SPARC v7 (you cannot even get the v7 manual anymore), so you can imagine how big a difference it would be to have Intel stamping out 2-3 GHz rad-tolerant CPUs compared to what we are using now).
I suppose the radiation in space is on a somewhat different level, so you still need special rad-hard chips. I guess you can consider yourself lucky if you're locked on SPARC v7 because compared to other options, it's still reasonably close to some industry standard.
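The triple-redundant arrangement mentioned in this sub-thread usually comes down to a majority vote over three copies of the same value. Below is a minimal software sketch of that idea; real fault-tolerant CPUs do this in hardware, and the helper names here are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit takes the value
    held by at least two of the three inputs."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Simulate a single-event upset by inverting one bit."""
    return word ^ (1 << bit)


value = 0b10110010

# One replica is hit: the other two outvote it.
one_hit = majority_vote(value, flip_bit(value, 5), value)

# Two replicas are hit in the same bit: the vote is now wrong.
two_hits = majority_vote(flip_bit(value, 5), flip_bit(value, 5), value)

print(one_hit == value)    # the single upset is masked
print(two_hits == value)   # the double upset is not
```

The second case is why triple redundancy is normally paired with periodic scrubbing or resynchronisation: it masks one faulty copy at a time, not two.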
Re: (Score:1)
I haven't read the patent.... but I wonder if there are issues with the detector location vs where the event occurs... how many detectors does it take to cover a 20X20 mm chip? is it really worth that real-es
Should be testable (Score:2)
As Mrsooreams wrote just one post below you, it seems not guaranteed that the particles actually hit the detector and not only the computing elements. So I'd take this one with a grain of salt...
How? (Score:5, Interesting)
Re:How? (Score:4, Informative)
It's a lot less likely to cause problems than trying to guess which bit it was, and far less expensive than building a RAIMM(TM) to compensate for it.
Re: (Score:2, Insightful)
As the GP said, there is no way of knowing whether a cosmic ray passed through you or not. The cosmic ray could easily just smash your bit to a new, random state and pass happily unhindered through the actual detector thingy. The only way to improve the situation would be to build a large detector volume (at least a couple cm^3).
Re: (Score:2)
Re:How? (Score:5, Informative)
With cosmic rays, it's not just "gone". Instead, you get a shower of new energetic particles generated by the collision which compounds the risk of operational errors. The patent specifically mentions alpha particles knocked out of the atoms in the chip by the ray which travel through the circuits causing havoc.
The patent also mentions that the detector may sense side effects of collision (such as voltage spikes) rather than the ray particle itself. Thus, the damage has already been done by the time the detector sees the event.
Re: (Score:2, Funny)
Re: (Score:2)
Re: (Score:3, Informative)
Interaction of radiation with matter (Score:2)
ECC Memory not good enough? (Score:4, Insightful)
Re: (Score:2, Funny)
Yeah, its utterly ridiculous to believe that strange radiation from outer space can mes#[!^ ~` . '
Rollback? Repeat last operation? Not likely. (Score:3, Interesting)
Sometimes the more "esoteric" designers attempt to get, the more potential for disaster they create.
Cosmic ray detection would be far better for random number generation, than anything else.
Re: (Score:2)
Just fantastic (Score:4, Funny)
patenting ideas is patently stupid... (Score:1)
You do not understand. (Score:2)
The patent office does not insist on working models unless it is an extremely unlikely idea... like perpetual motion, or free energy. There are good reasons for that.
Re: (Score:2)
First, w
I forgot to mention... (Score:2)
Networking possibilities for science? (Score:2)
Does anyone in a relevant field see a good use for this?
Re: (Score:3, Insightful)
Re: (Score:2)
That may be the best argument against the whole 'wisdom of crowds' Web 2.0 thing that I have ever heard.
Shouldn't take long... (Score:3, Interesting)
Current work and contribution of this paper (Score:5, Interesting)
The main problem with using codes is that cosmic ray errors cause what are called single event upsets (SEUs), and most codes cannot detect 100% of errors where the Hamming weight of the error (the number of ones in the error vector) is larger than the code was designed for. The problem comes when the SEU manifests itself as a multi-bit fault and the error vector cannot be detected by the code. SEUs are the most common type of error in space applications: see http://www.eas.asu.edu/~holbert/eee460/see.html [asu.edu]
The contribution of the cosmic error detector is that if you know you have a cosmic ray at some point in time, you can flush and redo your computation (for computation channels, e.g. microprocessors) or flush that line in memory (for memory channels) in case of SEUs, and that is a pretty big deal.
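The point about error weight exceeding what a code was designed for can be seen with a small example. The sketch below uses a Hamming(7,4) code, which is my choice for illustration and not a code named in the post: it corrects any single flipped bit, but a two-bit upset makes the decoder "correct" a third, innocent bit and hand back wrong data without any warning.

```python
def encode(data):
    """Hamming(7,4): 4 data bits -> 7-bit codeword, positions 1..7,
    with parity bits at positions 1, 2 and 4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def syndrome(codeword):
    """XOR of the (1-based) positions of all set bits; 0 for a valid codeword."""
    s = 0
    for pos, bit in enumerate(codeword, start=1):
        if bit:
            s ^= pos
    return s


def decode(codeword):
    """Correct the single bit the syndrome points at, then extract the data."""
    cw = list(codeword)
    s = syndrome(cw)
    if s:
        cw[s - 1] ^= 1
    return [cw[2], cw[4], cw[5], cw[6]]


def upset(codeword, *positions):
    """Flip the bits at the given 1-based positions."""
    cw = list(codeword)
    for pos in positions:
        cw[pos - 1] ^= 1
    return cw


data = [1, 0, 1, 1]
cw = encode(data)

single = decode(upset(cw, 6))      # weight-1 error: within the design limit
double = decode(upset(cw, 1, 2))   # weight-2 error: beyond it

print(single == data)   # corrected
print(double == data)   # silently miscorrected
```

Extended codes (SEC-DED) add one more parity bit so that two-bit errors are at least detected, but the same limit reappears one step further out, which is where an independent signal that a strike happened becomes useful.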
Re:Current work and contribution of this paper (Score:5, Interesting)
Re: (Score:1)
Peter
Processor instruction retry (Score:3, Informative)
cosmic (Score:2)
If the problem is with RAM... (Score:4, Informative)
The tricky problem isn't RAM - it's computational elements. There is no single way to error-correct computational elements because they are so diverse. A multiplier would need different protection to an adder which is different from a shift-register. Hence, the idea of rolling back (say) the last instruction executed and having a "do-over".
But for large arrays of homogeneous circuitry - like RAM - this doesn't seem worth the effort.
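The "do-over" idea can be sketched independently of what the computational element actually is: snapshot the state, run the operation on a copy, and throw the result away if the detector fired in the meantime. This is a toy model under my own assumptions, not the mechanism claimed in the patent; `strike_detected` stands in for the hardware detector and every name here is hypothetical.

```python
import copy
from typing import Callable


def run_with_retry(state: dict,
                   step: Callable[[dict], dict],
                   strike_detected: Callable[[], bool],
                   max_retries: int = 3) -> dict:
    """Run `step` on a copy of `state`; if a strike was flagged while it ran,
    discard the result and redo the step from the untouched original."""
    for _ in range(max_retries + 1):
        candidate = step(copy.deepcopy(state))   # original state is the checkpoint
        if not strike_detected():
            return candidate                     # commit the result
    raise RuntimeError("strike flagged on every attempt")


def add_one(regs: dict) -> dict:
    """Stand-in for any operation: adder, multiplier, shifter..."""
    regs["acc"] += 1
    return regs


# Simulated detector: fires during the first attempt, stays quiet on the second.
strikes = iter([True, False])
result = run_with_retry({"acc": 41}, add_one, lambda: next(strikes))

print(result)
```

The appeal is that the retry wrapper does not care whether the step is an add or a multiply, so one mechanism covers all of the diverse logic. The cost is that it only helps if the detector actually fires whenever the logic is disturbed, which other posts in this thread question.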
Re: (Score:3, Informative)
On a regular basis I participate in the "radiation testing" of laptops intended for use on both the Space Shuttle and the International Space Station. This testing is normally done at Indiana University's Cyclotron Facility in B
Cosmic-Ray-Detecting@Home? (Score:2)
Why don't they... (Score:4, Insightful)
Re: (Score:1, Funny)
Re: (Score:1)
Re: (Score:2)
Best Security Vulnerability Ever (Score:1)
Re: (Score:2, Funny)
Re: (Score:2)
DARPA Empire (Score:2)
Attacking the JVM (Score:2, Interesting)
Defensive patent. (Score:3, Interesting)
wait... (Score:2)
Re: (Score:2)
CS student excuses (Score:1)
I have a better solution. (Score:5, Funny)
Re: (Score:2)
Oddly enough, that will not work well for direct impacts although it might be worthwhile at sea level. If you add shielding around the chip and it is directly exposed to a cosmic ray event, the shielding just serves to create a shower of particles which then affect a much larger area and transfer much more energy.
Waste of space. (Score:2)
It is a waste of space.
It would be cheaper (and maybe even lighter) to just radiation-harden the chip.
Free opensource rad tolerant processor here. (Score:2)
Insanity prevails? (Score:1)
Meaning that the insane one was allowed to try and patent this? ;)
Breaking news! (Score:2, Funny)
Google Patents (Score:2)
here [google.com]
CMOS RAM? (Score:2)
I've looked into a few systems that arrived DOA due to a corrupt CMOS RAM (they were OK after resetting them with the jumper on the motherboard) after air shipment from the US to Europe or Asia and I wonder if that's the root cause.
Or you could just do this.... (Score:1)
Re: (Score:1)
Re: (Score:1)
don't recreate the wheel, use ECC xor chipkill. (Score:2)
I really fail to see this as more than some marketing hype for