Eukariote writes "An estimated 18 million laptops with NVidia G84 and G86 graphics chips sold in the past one and a half years are experiencing high failure rates. Various laptop models from multiple manufacturers (Apple, Dell, HP, Lenovo, and others) are affected. NVidia blames it on bad chip packaging causing thermal failure. BIOS updates that turn the laptop fan on more frequently or permanently have been released by Dell and HP. The cynical interpretation is that this is likely to only delay the problem until the warranty has expired."
Having to run my laptop fan all of the time to account for a bad chip is an unacceptable fix. It's loud, it takes more electricity to run, and it shortens the life of the fan, and possibly the whole computer as a result.
Note that they conveniently prevent [hp.com] you from downloading the old BIOS to revert the upgrade if the increased fan noise is a problem for you, by removing the old version from their web site. Under the pretense of "avoiding confusion", they will not allow you to get the original version:
I do not see the previous BIOS version on the HP Support Web? What happened to the previous versions of the BIOS?
In order to eliminate any confusion on which BIOS version is the latest, only the latest version is available on the Web.
Ford tried to do this to me with my car. It would make a shuddering noise somewhere in the front end at low speed (eg parking lots). I mentioned it to them each service and they said they'd look at it, and when I got it back after the service they said they'd flushed the power steering system and upgraded the car computer firmware.
The first service after the warranty expired I took it in and they said that there was a faulty hose causing the problem and it would take $$$ to fix. I got them to fix it under warranty eventually but I wonder how many other people they screwed over...
Hey, I benefited from almost the exact same problem. I was test driving a program car, and it drove like a dream - except when I turned the steering wheel all the way to the side. Then it sounded like someone had a low-speed metal grinder under the hood. I told the salesman, and he pulled the car's record to look into it. As it turns out, the previous owner had tried to get that noise fixed maybe 5 times, but their dealer couldn't permanently repair it, so the owner returned it under their state's lemon law. My local dealer asked if I'd be interested in the car if they could fix it, so I went home to let them dig around inside.
As it turns out, there's a corrugated metal hose near the steering mechanism. When you turned the wheel all the way, it pushed a motor against that hose and caused the noise. The permanent fix? A plastic wire tie to pull the hose half an inch to the side. I got the car in mint condition for half price eight years ago, and I'm still driving it today.
Ummm, this is an obligatory car analogy to the laptops, so don't mod me off-topic.
I'm sitting on a Lenovo laptop with an Nvidia and the fan doesn't come on under normal conditions until the laptop is what I would consider too hot (~65 Celsius).
Dude, don't sit on the laptop then, it's no surprise that it gets hot.
I don't know about the US, but in the EU, you are "entitled to have the goods brought into conformity free of charge by repair or replacement" even if they aren't broken as such (i.e. dead).
If you choose to take the cynical interpretation why not ignore the update and hope it fails in the warranty? Of course if you do that and it fails not long after the warranty then you'll have only yourself to blame for being a cynical bastard.
You have the possibility of doing the opposite: leave your machine on often.
And often stress the GPU for abnormal periods of time (i.e. leave a hardware-accelerated graphics process running non-stop in order to tax the GPU for a few thousand hours).
Presumably, if your GPU is faulty, it should fail during the extreme system stress testing.
And if it survives you can apply the BIOS patch afterwards if you see fit, confident that the GPU is fine.
Here are excerpts from the most amusing description [theinquirer.net] of the problem:
All Nvidia G84 and G86s are bad
The short story is that all the G84 and G86 parts are bad. Period. No exceptions. All of them, mobile and desktop, use the exact same ASIC, so expect them to go south in inordinate numbers as well. There are caveats however, and we will detail those in a bit.
Both of these ASICs have a rather terminal problem with unnamed substrate or bumping material, and it is heat related. If you ask Nvidia officially, you will get no reason why this happened, and no list of parts affected, we tried. Unofficially, they will blame everyone under the sun, and trash their suppliers in very colourful language.
When the process engineers pinged by the INQ picked themselves off the floor from laughing, they politely said that there is about zero chance that NV would change the assembly process or material set for a batch, much less an EOL part.
For dessert, there's this [theinquirer.net] article to finish:)
by Anonymous Coward
on Thursday July 31 2008, @10:16PM (#24427817)
i think that the better quality control of apple makes my computer immune to the problem, the genius bar can surely fix this problem and replace the computer for a new one, try this with dell.
I heard that Steve Jobs can smell a faulty substrate, even if it isn't going to fail for years, and that he personally sniffs every chip that goes into the production line to protect us Apple fans!
Separately, NVIDIA plans to take a one-time charge from $150 million to $200 million against cost of revenue for the second quarter to cover anticipated warranty, repair, return, replacement and other costs and expenses, arising from a weak die/packaging material set in certain versions of its previous generation GPU and MCP products used in notebook systems. Certain notebook configurations with GPUs and MCPs manufactured with a certain die/packaging material set are failing in the field at higher than normal rates. To date, abnormal failure rates with systems other than certain notebook systems have not been seen. NVIDIA has initiated discussions with its supply chain regarding this material set issue and the Company will also seek to access insurance coverage for this matter.
Regarding the notebook field failures, NVIDIA president and CEO Jen-Hsun Huang stated:
"Although the failure appears related to the combination of the interaction between the chip material set and system design, we have a responsibility to our customers and will take our part in resolving this problem. The GPU has become an increasingly important part of the computing experience and we are seeing more interest by PC OEMs to adopt GPUs in more platforms. Recognizing that the GPU is one of the most complex processors in the system, it is critical that we now work more closely with notebook system designers and our chip foundries to ensure that the GPU and the system are designed collaboratively for the best performance and robustness."
Today's high performance notebooks are highly complex systems with extreme thermal environments. The combination of limited thermal management and frequent power cycling is particularly challenging for complex processors like the GPU.
Huang added, "This has been a challenging experience for us. However, the lessons we've learned will help us build far more robust products in the future, and become a more valuable system design partner to our customers. As for the present, we have switched production to a more robust die/package material set and are working proactively with our OEM partners to develop system management software that will provide better thermal management to the GPU."
As detailed in this thread [laptopvideo2go.com], the GF8400 has serious performance problems under Vista Aero when running recent driver versions. I wonder if this is related? - i.e. Recent driver updates have down-clocked the GPU leading to bad performance.
Dell has, however, recently acknowledged the problem and is working on a fix.
Charlie gets it right. Let's see, 18 million notebook machines. Freight each way, plus cost of labor to fix them and the materials needed. Less than $10 a machine! Great, that math stuff. Yup, a $150-200 million charge oughta do it at around $10 a machine!
Hello? This is the SEC? Hey, I have a question about an 8K I saw for NVidia. It goes like this.....
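For anyone who wants to check the sarcasm above, here is a rough back-of-the-envelope sketch. It assumes the 18 million figure from the summary and the $150-200 million charge from NVIDIA's statement, and it assumes (probably wrongly) that every one of those laptops would need service. The names are made up for illustration.

```python
# Figures quoted in the thread; everything else here is illustrative.
AFFECTED_LAPTOPS = 18_000_000   # estimate from the story summary
CHARGE_LOW_USD = 150_000_000    # low end of NVIDIA's one-time charge
CHARGE_HIGH_USD = 200_000_000   # high end of NVIDIA's one-time charge


def per_machine(charge_usd: int, machines: int = AFFECTED_LAPTOPS) -> float:
    """Dollars available per laptop if the charge were spread over every machine."""
    return charge_usd / machines


print(f"${per_machine(CHARGE_LOW_USD):.2f} to ${per_machine(CHARGE_HIGH_USD):.2f} per machine")
# prints "$8.33 to $11.11 per machine"
```

So "around $10 a machine" only holds if all 18 million units are counted. The charge is sized for the failures NVIDIA anticipates, so the two numbers are not measuring the same thing.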
Does this have anything to do with the Xbox 360's Red Ring of Death [wikipedia.org]? And do these problems, in turn, have something to do with RoHS [wikipedia.org] certification, due to lead-free solders being less durable?
Nvidia has been said to have had a hand in the design of some parts of the 360, and the problem sounds like it is identical.
That said, my own laptop (a Dell Inspiron 6000i) sees at least 8 hours a day of actual use, and is generally powered on at least 20 hours per day. The default fan control keeps the fan spinning all the time at smoothly varied speeds, with a heavy tendency to keep it spinning at high speed for long periods of time following heavy loads. This is very annoying to me.
Instead, I run i8kfangui, which lets me control (based on the temperature of the CPU, GPU, RAM, or hard drive) the fan's speed. It keeps dust accumulation and noise down, and works pretty well. The tradeoff is that it (by my choice) keeps the CPU in a constant and dramatic swing between 52 and 43 degrees Celsius: The fan is simply off below 43C, then turns at low speed once the CPU reaches 52C. If it gets to 68C (which almost never happens, and is quite hot for a CPU) it spins at high speed. I find this behavior to be very preferable.
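The fan policy described above is a plain hysteresis loop, and a minimal sketch of it might look like the following. The three thresholds come from the comment; the function names are hypothetical, this is not how i8kfangui is actually implemented, and the rule for dropping from high back to low speed is an assumption since the comment does not say.

```python
# Thresholds from the comment; names and structure are illustrative only.
FAN_OFF_BELOW_C = 43   # fan switches off once the CPU cools below this
FAN_LOW_AT_C = 52      # fan starts at low speed once the CPU reaches this
FAN_HIGH_AT_C = 68     # fan goes to high speed at or above this


def next_fan_state(current: str, temp_c: float) -> str:
    """Return 'off', 'low' or 'high' from the current state and CPU temperature."""
    if temp_c >= FAN_HIGH_AT_C:
        return "high"
    if temp_c < FAN_OFF_BELOW_C:
        return "off"
    if current == "off":
        # Hysteresis band: stay off until the temperature climbs to 52C.
        return "low" if temp_c >= FAN_LOW_AT_C else "off"
    # Fan already running between 43C and 68C.
    # Assumption: a fan on high drops back to low in this band.
    return "low"


def simulate(temps):
    """Run the controller over a sequence of temperature readings, starting with the fan off."""
    state = "off"
    states = []
    for temp in temps:
        state = next_fan_state(state, temp)
        states.append(state)
    return states


print(simulate([40, 50, 52, 47, 43, 42, 50]))
# prints ['off', 'off', 'low', 'low', 'low', 'off', 'off']
```

The gap between the 52C switch-on point and the 43C switch-off point is what produces the slow-climb, fast-fall temperature swing.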
But the point is that it is generally a slow climb to 52C, and a fast fall to 43C, over and over in an abusive thermal-stress scenario. This cycle repeats a dozen or so times per hour, 8-20 hours per day, and has done so for three years. It works fine.
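To put a number on that, here is a quick estimate of how many thermal cycles this adds up to. It takes "a dozen or so" as exactly 12 and assumes the machine is used every day of the year, so treat the result as a rough order of magnitude.

```python
CYCLES_PER_HOUR = 12   # "a dozen or so", taken as exactly 12
YEARS = 3
DAYS_PER_YEAR = 365    # assumes daily use, ignoring leap days


def thermal_cycles(hours_per_day: int) -> int:
    """Total 43C-to-52C swings over the whole period at a given daily usage."""
    return CYCLES_PER_HOUR * hours_per_day * DAYS_PER_YEAR * YEARS


low, high = thermal_cycles(8), thermal_cycles(20)
print(f"{low:,} to {high:,} cycles")
# prints "105,120 to 262,800 cycles"
```

These are shallow swings of about 9C, a much smaller excursion than a full cold-start-to-hot power cycle, so the count is not directly comparable to the GPU failures being discussed.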
The motherboard is not RoHS compliant, and so presumably was built with lead-based solder. However it seems that most new machines are built with lead-free solders [wikipedia.org], all of which seem to have various problems.
Are there any metallurgists in the house who might care to speculate on the relationship between lead-free solders and systemic failure of laptops due to heat cycling?
I'd read that the 360 had certain component(s) designed by Microsoft in-house (as a cost-saving measure), which had lousy thermal characteristics, and which they sought the help of nVidia to rectify. I'm unable to find a reference at this time, but I do believe my statement to be true, whether or not the GPU in the 360 is an ATI part.
Although RoHS probably contributed to the RRoD, mostly it was an improper thermal solution. There was an article awhile back where it was discovered that Microsoft engineers decided to cut costs by designing the heatsink system themselves. Insufficient cooling and an improper mounting system allowed the board to warp more than the RoHS solder could handle. Newer 360's have lots of extra epoxy around the package to keep it from pulling too far away from the motherboard.
For example, flip-chip technology uses a solder BGA to connect the silicon chip to a substrate. That substrate is then also usually connected to a motherboard through a solder BGA.
The lack of lead in solder is a technological issue and as such is solved by more advanced technology. Certainly there are few people here who are opposed to higher technology?
Sure we can whine about the extra work we are forced to do, or the fact that we have to pay for higher technology, but what good does that do? As technologically savvy people we live for the chance to advance the technology. We see these opportunities all over the place. Smaller cars require innovative means to increase safety and power. Smaller computers require more power-efficient components and better batteries. Having one type of plastic go away just opens up a space for innovative new plastics. This is what makes the world exciting.
So, if some company can't keep up, then they just suck as technologists and need to go away. A car company can't make technologically advanced cars, screw them. A video card manufacturer can't keep up with the trends and make a reliable video card, screw them too. I have been involved in a number of situations where the process had to be rethought. Someone whines that a baby might be born with a defect and we can't use this chemical. Someone complains that the dust will give them cancer and we must use a hood. Someone complains that we can't reliably dispose of an agent, and we must switch agents. Sure, we could say who cares if some worker dies. So what? But in each case the change was made, and technology gave us an equal or better solution.
It is always easier to blame failure on external forces rather than taking responsibility for a personal lack of creativity. This change in solder is not the first scapegoat used by those that lack innovative solutions, and won't be the last. There will always be firms that say a problem can't be solved, and they will generally be overthrown by those who then find the solution. I think that any number of lazy American firms are discovering that right now, while others are riding the wave of can-do innovations.
Your comment assumes that higher technology is always better.
Sometimes what you need is a hammer, not a jackhammer. I'm not convinced the massive failures all over the place that result from using lead-free solder are worth the incremental environmental benefit.
The HP DV2000, DV6000 and DV9000 series laptops are all affected. The BIOS updates just make the fan spin more often, that's it. HP has extended the MFG warranties to 2 years from the date of purchase. At GeekSquad/Best Buy HP has been offering a LOT of replacements for these laptops authorized through HP, but the laptops have to be DOA and sent to service, which takes about a week to two weeks. I've sent off at least 15 HP laptops in the past 6 months for replacement/repair. I give HP some credit for at least trying to fix the problem and/or replace the whole laptops themselves. I don't know what other MFGs are doing.
The link to the HP "Service Enhancement" (gotta love marketing) saved my butt. I had a DV2000 laptop do exactly this, just a week or so after reading this [consumerist.com] article on The Consumerist.
I called HP and, after convincing the tech support guy that removing Vista and installing XP on the laptop did NOT cause the problem, sent it off for repairs in the middle of June. I was given a 2 week time period for it to be finished.
After a week and a half they sent me an e-mail saying that parts were on order and it might be another week. So July 8th was the new date.
After the 9th I called HP again and again was told parts were still on order. I was given a new date of July 22nd! I e-mailed HP's CEO [hp.com] and was contacted a few days later. HP said that they had been authorized to replace this series of laptop and asked me to fax in the specs from the broken one, which I did. About 2 weeks later a laptop was shipped to my old address (after having given HP the new one on 3 occasions: when I first called tech support, when I e-mailed the CEO, and when the case manager contacted me).
The laptop arrived and so far the only thing that doesn't work is DVD burning. Sure, it gets about 92% done, then dies. I've given up though and decided to just not buy HP products anymore.
To those who are having the problems mentioned for HP I strongly suggest sending an e-mail to Mark Hurd, the CEO. He doesn't write back personally obviously but someone contacted me just a day or two later.
It's just too bad HP has come to this (whether it's nVidia's fault or not is open to debate) but after an issue arises it is up to the manufacturer to take responsibility for their products. Man, I remember the days of HP meaning quality, the 2, 3, 4, and 5 series of laser printers were slow, sure, but they were steel and lasted forever. Now they sell these plastic pieces of crap that die after a year and, when contacted, all HP will do is give you $50 off of a new one. Wow, did Carly destroy HP or what?
... I would "stress test" the hell out of it more so if the manufacturer will be replacing it with an Intel or ATI GPU...
Sure this might be borderline immoral but aren't the laptop manufacturers in conjunction with nVidia acting in bad faith by not replacing the defective laptops with non defective ones? BIOS updates to run the fans all the time is not the real solution.
Its price is the lowest since 1990 ($4.2 today);
Just fired its CEO;
Very favorable reviews for upcoming ATI4xxx GPU;
Troubles for NV;
What do ya think?
The USAF had a reliability program that ran from the mid-1960s to the mid-1980s which did quite a bit to make electronics more reliable in the field. About 1% of the USAF's "black boxes" were marked with stickers that said something like "USAF Reliability Program Unit - If unit breaks, replace entire unit and send broken unit to... for analysis".
When broken units came into the analysis shop, a considerable effort was made to find out exactly which component had failed and how it had failed. This went way beyond normal repair. When a bad part was located, the part was opened up and examined with an electron microscope or X-rayed, as appropriate, to see exactly what had gone wrong.
The USAF would frequently publish pictures from this program in Aviation Week. You'd see pictures of bad lead joints inside an IC package, too-long internal leads that had failed under high G loads, and bad on-chip etching. Manufacturers of bad parts were named. Inspectors were sent to plants to figure out what had gone wrong with the manufacturing process. The problem got fixed or the supplier stopped getting military contracts.
This worked well when the military bought most electronic components. By the 1980s, consumer electronics were using electronics at least as sophisticated as the military, and the military had to start using "commercial, off the shelf" components. Today, the USAF has trouble getting any special attention from parts suppliers.
Auto manufacturers still do things like this. Because they have to pay for recalls, they need to find out why things break and fix the production process, even if it's at a supplier.
There is a problem with the chips, there is no doubt about that. However take anything Charlie says about it with a huge truckload of salt. There was a bit of bad blood between Nvidia and Charlie years ago (something like 4 or 5 now), and ever since they've refused to talk to anyone from the Inquirer and Charlie specifically.
It seems these days that all [theinquirer.net] Charlie [theinquirer.net] does [theinquirer.net] is [theinquirer.net] write [theinquirer.net] long [theinquirer.net] article [theinquirer.net] bashing [theinquirer.net] Nvidia [theinquirer.net]. That is unless he's writing an article that's so over the top that his editor has to pull it [theinquirer.net] (yes, believe it or not, there actually is an editor in charge of all those pieces).
Go read the Dell or HP forums and EE Times. Read The Inq only if you want some amusement at how amazingly slanted a story can be.
It seems these days that all Charlie does is write long article bashing Nvidia. That is unless he's writing an article that's so over the top that his editor has to pull it (yes, believe it or not, there actually is an editor in charge of all those pieces).
"The power of accurate observation is frequently called cynicism by those who don't have it." - George Bernard Shaw
The question you raise I'll restate as: Is what Charlie is saying wrong? I prefer Nvidia to ATI because of their Linux drivers. But drivers alone a complete system does not make. Real is real and the truth is truth.
On Inspiron 1420s the Nvidia is an option - and was back in early 2007 when I got mine. Unless you specifically paid for the 'better' chip, you got an Intel® GM965 Express chipset, with Graphics Media Accelerator X3100.
by Anonymous Coward
on Thursday July 31 2008, @10:24PM (#24427881)
A link? Shit I own one. Dell XPS m1330; I've had the motherboard replaced twice already for video failure, and I got the thing in September of 07. Yes, that's right, replaced twice in less than a year.
The flaw is every bit as bad as everyone makes it out to be.
Personally, I've never used my display on my MacBookPro. The UI on OSX is so wonderful, that I do not even have to look at it. I practically imagine what I want to open, and it opens it for me! This coupled with the nice sounds, let me know when I've opened the right application. If worst comes to worst, I can just use the option key combos to start my music, to start web-browsing etc.
I've never used it, so to be honest, I don't see why anyone would want such a feature, let alone need it.
Unfortunately people think there's a difference with a macbook logic board (intel *coughs*) and an intel motherboard. Though a fan of OS X, Apple needs to give up on putting their apple logo stickers over the original 3rd party vendors hardware. It's a fucking PC/laptop with EFI.
My MacBookPro turned on one morning, and everything worked but the display. I managed to log in, launch iTunes and play some music, but no graphics output. A trip to the Apple store later and I'm out a machine for a week. Never had an explanation, but now I am curious if I should send it back and ask for a new logic board with a graphics chip that isn't going to fail again prematurely due to faulty design.
Well, unless your replaced logic board fails again, I don't think Apple would take it back for replacement, since it basically works. Unfortunately, the affected GPUs are basically the entire nVidia 8x00 line (except for desktop 8300, and all the 8800's). Very few laptops actually use the 8800M GPU (think gaming laptops), so any other replacement, even a new laptop with an nVidia chipset will likely have the problematic GPU. The other alternative is to find a laptop with an AMD/ATi or Intel GPU.
They're not actually shipping the affected product anymore, so presumably if you get a newly enough manufactured replacement part, you won't have the problem on the new piece of equipment.
But it is Nvidia's fault because they signed off on these cooling units.
That is like saying it isn't your car maker's fault if they put brakes in your car designed for a lawnmower, and that it is obviously the fault of the people making these lawnmower brakes for not making sure they can brake a much heavier car...
From what I'm reading the issue isn't with fans not performing as expected. The issue is that at the performance rate Nvidia had them at they simply didn't do the job needed, resulting in the GPU overheating and destroying itself.
It is entirely, 100% Nvidia's fault. If you put in substandard parts you get a substandard result.
Agreed. Most reference coolers (and even a lot of 3rd party ones) aren't worth the cheap plastic used to make them. When I pulled the ref cooler off my 8800GT last year I was shocked to find that the fan didn't even sit completely atop the core, and that there was a LOT of excess thermal paste and stupidly thick thermal pads. It's little surprise the card was heatsoaking to 90C after a few hours of Bioshock and crashing itself! I can only cringe in horror when I imagine something like that stuffed into a freaking laptop. Fortunately I had already planned on replacing the stock cooler (just a big heatpipe/heatsink with a 120mm fan ziptied to it) and lo and behold my card now has trouble hitting low 40's even after hours of flogging.
Long story short, all manufacturers should be held accountable for the idiotic shortcuts they take when it comes to cooling their electronics. It's kind of an important aspect of electronics, no? Why not spend a buck or two more on something that actually does the job? Till then the first thing I do with any graphics card (or CPU for that matter) is still going to be to chuck the stock cooler into my parts bin, and then look for something bigger or better.
Why is it all Nvidia's fault? Seems to me it should be a shared responsibility.
I work for a company big into mobile IC design (like NVIDIA). And I can say that it is very likely NVIDIA's fault because they (as do we), as the design company, specify every last detail of process, circuit, and package, when it comes to IC fabrication. Additionally, the company which produced these chips--TSMC--is the oldest, largest, and possibly most reliable dedicated fab company in existence. If there is a heat dissipation problem, it almost certainly stems from engineering oversight or management's corner-cutting on NVIDIA's part.
My DELL XPS M1710 has a 7950GTX and never had any issues. The DELL BIOS does have some issues with heat management so I run i8kfan to keep heat at acceptable levels. On top of that, did you know most new DELL laptops (confirmed on XPS and VOSTRO) won't read S.M.A.R.T? I think heat killed my original hard drive but the BIOS wouldn't report the drive was going bad. They should fire whoever made the decision that removing this feature was an improvement.
Literal interpretation (Score:5, Insightful)
Re:Literal interpretation (Score:5, Interesting)
Re:Literal interpretation (Score:5, Interesting)
Re:Literal interpretation (Score:5, Interesting)
Re:Let me guess.... (Score:5, Funny)
The article about trolling is the next one down. Easy mistake to make.
Re:Literal interpretation (Score:5, Funny)
Re:Literal interpretation (Score:4, Funny)
Is "I'm sitting on [x]" used in the US or UK as a way of saying that you own/use [x] at all?
Yes, for [x] == chair, bench, bean bag, younger brother, etc.
Today's fun fact (Score:5, Insightful)
Re:Today's fun fact (Score:4, Informative)
Re:Today's fun fact (Score:4, Insightful)
Nvidia appears to be screwed... (Score:5, Informative)
Is my macbook faulty? (Score:3, Funny)
Re:Is my macbook faulty? (Score:5, Funny)
NVIDIA's Official Statement (Score:5, Informative)
The GF8400 has other (or related) problems on Dell (Score:5, Interesting)
You just know there's a class action out there.... (Score:5, Interesting)
waiting to form.
Are the enviromentralists killing our PCs? (Score:3, Interesting)
Does this have anything to do with the Xbox 360's Red Ring of Death [wikipedia.org]? And do these problems, in turn, have something to do with RoHS [wikipedia.org] certification, due to lead-free solders being less durable?
Nvidia has been said to have had a hand in the design of some parts of the 360, and the problem sounds like it is identical.
That said, on my own laptop (a Dell Inspiron 6000i) sees at least 8 hours a day of actual use, and is generally powered on at least 20 hours per day. The default fan control keeps the fan spinning all the time at smoothly varied speeds, with a heavy tendency to keep it spinning at high speed for long periods of time following heavy loads. This is very annoying to me.
Instead, I run i8kfangui, which lets me control (based on the temperature of the CPU, GPU, RAM, or hard drive) the fan's speed. It keeps dust accumulation and noise down, and works pretty well. The tradeoff is that it (by my choice) keeps the CPU in a constant and dramatic swing between 52 and 43 degrees Celcius:
The fan is simply off below 43C, then turns at low speed once the CPU reaches 52C. If it gets to 68C (which almost never happens, and is quite hot for a CPU) it spins at high speed. I find this behavior to be very preferable.
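The thresholds described above amount to a simple hysteresis controller. Below is a minimal sketch of that logic, not i8kfangui's actual implementation. The three thresholds come from the comment; the function name, the state names, and the choice to drop from high back to low as soon as the temperature falls below 68C are assumptions for illustration.

```python
# Hysteresis fan control sketch using the thresholds from the comment above.
# Assumption: once the fan is running, it keeps running at low speed until the
# temperature falls below 43C, and high speed is held only while at or above 68C.
OFF, LOW, HIGH = "off", "low", "high"
T_OFF, T_LOW, T_HIGH = 43, 52, 68  # degrees Celsius


def next_fan_state(state, temp_c):
    """Return the new fan state given the current state and CPU temperature."""
    if temp_c >= T_HIGH:
        return HIGH
    if temp_c >= T_LOW:
        return LOW
    if temp_c < T_OFF:
        return OFF
    # Between 43C and 52C: stay off if the fan was off, otherwise keep it on low.
    return OFF if state == OFF else LOW


# Walk through one slow climb and fast fall, like the cycle described above.
state = OFF
for temp in (40, 50, 52, 48, 43, 42):
    state = next_fan_state(state, temp)
    print(f"{temp}C -> fan {state}")
```

The gap between the 52C turn-on point and the 43C turn-off point is what keeps the fan from flapping on and off, and it is also what produces the repeated temperature swing.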
But the point is that it is generally a slow climb to 52C, and a fast fall to 43C, over and over in an abusive thermal-stress scenario. This cycle repeats a dozen or so times per hour, 8-20 hours per day, and has done so for three years. It works fine.
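To put a rough number on that, here is a quick estimate of the total thermal cycles. It assumes exactly 12 cycles per hour ("a dozen or so"), 365 days of use per year, and three full years; the function name is hypothetical.

```python
# Rough count of 43C-to-52C thermal cycles over the laptop's life so far.
# Assumptions: 12 cycles per hour, used every day of the year, for three years.
CYCLES_PER_HOUR = 12
DAYS_PER_YEAR = 365
YEARS = 3


def total_cycles(hours_per_day):
    """Estimate the number of thermal cycles at a given daily usage."""
    return CYCLES_PER_HOUR * hours_per_day * DAYS_PER_YEAR * YEARS


print("8 h/day: ", total_cycles(8))
print("20 h/day:", total_cycles(20))
```

Under those assumptions the board has been through somewhere between about 105,000 and 263,000 cycles.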
The motherboard is not RoHS compliant, and so presumably was built with lead-based solder. However, it seems that most new machines are built with lead-free solders [wikipedia.org], all of which seem to have various problems.
Are there any metallurgists in the house who might care to speculate on the relationship between lead-free solders and systemic failure of laptops due to heat cycling?
Re: (Score:3, Informative)
Does this have anything to do with the Xbox 360's Red Ring of Death [wikipedia.org]? [...]
Nvidia has been said to have had a hand in the design of some parts of the 360, and the problem sounds like it is identical.
Xbox 360 has ATI graphics. You must be thinking of the original Xbox, which did use NVIDIA graphics.
Re:Are the environmentalists killing our PCs? (Score:5, Interesting)
I'd read that the 360 had certain component(s) designed by Microsoft in-house (as a cost-saving measure), which had lousy thermal characteristics, and which they sought the help of nVidia to rectify. I'm unable to find a reference at this time, but I do believe my statement to be true, whether or not the GPU in the 360 is an ATI part.
Re:Are the environmentalists killing our PCs? (Score:4, Informative)
Re:Are the environmentalists killing our PCs? (Score:5, Insightful)
Although RoHS probably contributed to the RRoD, mostly it was an improper thermal solution. There was an article a while back where it was discovered that Microsoft engineers decided to cut costs by designing the heatsink system themselves. Insufficient cooling and an improper mounting system allowed the board to warp more than the RoHS solder could handle. Newer 360s have lots of extra epoxy around the package to keep it from pulling too far away from the motherboard.
Is solder used inside chips? (Score:3, Insightful)
Sounds like you're drawing a long bow to me.
The problem here sounds like it's inside the chips themselves.
I'm no metallurgist or hardware expert, but I'd have thought solder is used when mounting the chips to the board, not inside the chip itself.
Re: (Score:3, Informative)
For example, flip-chip technology uses a solder BGA to connect the silicon chip to a substrate. That substrate is then also usually connected to a motherboard through a solder BGA.
See: http://en.wikipedia.org/wiki/Flip_chip [wikipedia.org]
And: http://en.wikipedia.org/wiki/Ball_grid_array [wikipedia.org]
Re:Are the environmentalists killing our PCs? (Score:5, Insightful)
Sure, we can whine about the extra work we are forced to do, or the fact that we have to pay for higher technology, but what good does that do? As technologically savvy people we live for the chance to advance the technology. We see these opportunities all over the place. Smaller cars require innovative means to increase safety and power. Smaller computers require more power-efficient components and better batteries. Having one type of plastic go away just opens up a space for innovative new plastics. This is what makes the world exciting.
So, if some company can't keep up, then they just suck as technologists and need to go away. A car company can't make technologically advanced cars, screw them. A video card manufacturer can't keep up with the trends and make a reliable video card, screw them too. I have been involved in a number of situations where the process had to be rethought. Someone whines that a baby might be born with a defect and we can't use this chemical. Someone complains that the dust will give them cancer and we must use a hood. Someone complains that we can't reliably dispose of an agent, and we must switch agents. Sure, we could say who cares if some worker dies. So what? But in each case the change was made, and technology gave us an equal or better solution.
It is always easier to blame failure on external forces rather than taking responsibility for a personal lack of creativity. This change in solder is not the first scapegoat used by those that lack innovative solutions, and won't be the last. There will always be firms that say a problem can't be solved, and they will generally be overthrown by those who then find the solution. I think that any number of lazy American firms are discovering that right now, while others are riding the wave of can-do innovations.
Change for change's sake good? (Score:3, Insightful)
Your comment assumes that higher technology is always better.
Sometimes what you need is a hammer, not a jackhammer. I'm not convinced the massive failures all over the place that result from using lead-free solder are worth the incremental environmental benefit.
Desktop chips too, or only laptops? (Score:4, Interesting)
Are any desktop chips affected, or only laptop chips?
Re: (Score:3, Informative)
Are any desktop chips affected, or only laptop chips?
According to TFA both desktop and laptop chips are affected.
HP (Score:4, Interesting)
What are we talkin' about??? (Score:5, Informative)
Sorry, I was distracted by the picture of the BREASTS on TFA page
My laptop has been in the shop for 2 months now (Score:3, Informative)
I called HP and, after convincing the tech support guy that removing Vista and installing XP on the laptop did NOT cause the problem, sent it off for repairs in the middle of June. I was given a 2 week time period for it to be finished.
After a week and a half they sent me an e-mail saying that parts were on order and it might be another week. So July 8th was the new date.
After the 9th I called HP again and again was told parts were still on order. I was given a new date of July 22nd! I e-mailed HP's CEO [hp.com] and was contacted a few days later. HP said that they had been authorized to replace this series of laptop and asked me to fax in the specs from the broken one, which I did. About 2 weeks later a laptop was shipped to my old address (after having given HP the new one on 3 occasions: when I first called tech support, when I e-mailed the CEO, and when the case manager contacted me).
The laptop arrived and so far the only thing that doesn't work is DVD burning. Sure, it gets about 92% done, then dies. I've given up though and decided to just not buy HP products anymore.
To those who are having the problems mentioned for HP, I strongly suggest sending an e-mail to Mark Hurd, the CEO. He doesn't write back personally, obviously, but someone contacted me just a day or two later.
It's just too bad HP has come to this (whether it's nVidia's fault or not is open to debate), but after an issue arises it is up to the manufacturer to take responsibility for their products. Man, I remember the days of HP meaning quality. The 2, 3, 4, and 5 series of laser printers were slow, sure, but they were steel and lasted forever. Now they sell these plastic pieces of crap that die after a year and, when contacted, all HP will do is give you $50 off of a new one. Wow, did Carly destroy HP or what?
If my laptop had an nVidia GPU... (Score:3, Interesting)
... I would "stress test" the hell out of it, more so if the manufacturer will be replacing it with an Intel or ATI GPU...
Sure, this might be borderline immoral, but aren't the laptop manufacturers, in conjunction with nVidia, acting in bad faith by not replacing the defective laptops with non-defective ones? A BIOS update to run the fans all the time is not the real solution.
Is it time to pick up some AMD stocks? (Score:3, Interesting)
How to get reliability, although it won't happen. (Score:5, Interesting)
The USAF had a reliability program that ran from the mid-1960s to the mid-1980s which did quite a bit to make electronics more reliable in the field. About 1% of the USAF's "black boxes" were marked with stickers that said something like "USAF Reliability Program Unit - If unit breaks, replace entire unit and send broken unit to ... for analysis".
When broken units came into the analysis shop, a considerable effort was made to find out exactly which component had failed and how it had failed. This went way beyond normal repair. When a bad part was located, the part was opened up and examined with an electron microscope or X-rayed, as appropriate, to see exactly what had gone wrong.
The USAF would frequently publish pictures from this program in Aviation Week. You'd see pictures of bad lead joints inside an IC package, too-long internal leads that had failed under high G loads, and bad on-chip etching. Manufacturers of bad parts were named. Inspectors were sent to plants to figure out what had gone wrong with the manufacturing process. The problem got fixed or the supplier stopped getting military contracts.
This worked well when the military bought most electronic components. By the 1980s, consumer electronics were using electronics at least as sophisticated as the military, and the military had to start using "commercial, off the shelf" components. Today, the USAF has trouble getting any special attention from parts suppliers.
Auto manufacturers still do things like this. Because they have to pay for recalls, they need to find out why things break and fix the production process, even if it's at a supplier.
Another rant from Charlie (Score:5, Informative)
There is a problem with the chips, there is no doubt about that. However take anything Charlie says about it with a huge truckload of salt. There was a bit of bad blood between Nvidia and Charlie years ago (something like 4 or 5 now), and ever since they've refused to talk to anyone from the Inquirer and Charlie specifically.
It seems these days that all [theinquirer.net] Charlie [theinquirer.net] does [theinquirer.net] is [theinquirer.net] write [theinquirer.net] long [theinquirer.net] articles [theinquirer.net] bashing [theinquirer.net] Nvidia [theinquirer.net]. That is, unless he's writing an article that's so over the top that his editor has to pull it [theinquirer.net] (yes, believe it or not, there actually is an editor in charge of all those pieces).
Go read the Dell or HP forums and EE Times. Read The Inq only if you want some amusement to see how amazingly slanted a story can be.
Re:Another rant from Charlie (Score:4, Interesting)
It seems these days that all Charlie does is write long articles bashing Nvidia. That is, unless he's writing an article that's so over the top that his editor has to pull it (yes, believe it or not, there actually is an editor in charge of all those pieces).
"The power of accurate observation is frequently called cynicism by those who don't have it." - George Bernard Shaw
The question you raise I'll restate as: Is what Charlie is saying wrong? I prefer Nvidia to ATI because of their Linux drivers. But drivers alone a complete system does not make. Real is real and the truth is truth.
-[d]-
Re:Model numbers (Score:5, Informative)
Here are the Dell models which have BIOS updates, from TFA:
Inspiron 1420
Latitude D630
Latitude D630c
Dell Precision M2300
Vostro Notebook 1310
Vostro Notebook 1400
Vostro Notebook 1510
Vostro Notebook 1710
XPS M1330
XPS M1530
Re: (Score:3, Funny)
On Inspiron 1420s the Nvidia is an option - and was back in early 2007 when I got mine. Unless you specifically paid for the 'better' chip, you got an Intel® GM965 Express chipset, with Graphics Media Accelerator X3100.
Re:Model numbers (Score:5, Interesting)
A link? Shit I own one. Dell XPS m1330; I've had the motherboard replaced twice already for video failure, and I got the thing in September of 07. Yes, that's right, replaced twice in less than a year.
The flaw is every bit as bad as everyone makes it out to be.
Re:Model numbers (Score:5, Informative)
This was reported by the Inquirer (and here, I think) a few weeks ago, but apparently the news hasn't been getting around.
Re:Oh, So That's What Happened... (Score:4, Funny)
Re:Oh, So That's What Happened... (Score:5, Funny)
Personally, I've never used my display on my MacBook Pro. The UI on OS X is so wonderful that I do not even have to look at it. I practically imagine what I want to open, and it opens it for me! This, coupled with the nice sounds, lets me know when I've opened the right application. If worst comes to worst, I can just use the option key combos to start my music, to start web-browsing, etc.
I've never used it, so to be honest, I don't see why anyone would want such a feature, let alone need it.
Re: (Score:3, Informative)
Re:Oh, So That's What Happened... (Score:5, Insightful)
Unfortunately, people think there's a difference between a MacBook logic board (Intel *coughs*) and an Intel motherboard. Though I'm a fan of OS X, Apple needs to give up on putting their Apple logo stickers over the original third-party vendors' hardware. It's a fucking PC/laptop with EFI.
Re:Oh, So That's What Happened... (Score:5, Informative)
Well, unless your replaced logic board fails again, I don't think Apple would take it back for replacement, since it basically works. Unfortunately, the affected GPUs are basically the entire nVidia 8x00 line (except for desktop 8300, and all the 8800's). Very few laptops actually use the 8800M GPU (think gaming laptops), so any other replacement, even a new laptop with an nVidia chipset will likely have the problematic GPU. The other alternative is to find a laptop with an AMD/ATi or Intel GPU.
Re:Oh, So That's What Happened... (Score:4, Interesting)
They're not actually shipping the affected product anymore, so presumably if you get a recently enough manufactured replacement part, you won't have the problem on the new piece of equipment.
Re:So, is it not fair (Score:5, Insightful)
But it is Nvidia's fault because they signed off on these cooling units.
That is like saying it isn't your car maker's fault if they put brakes in your car designed for a lawnmower, and instead it is obviously the fault of the people making these lawnmower brakes for not making sure they can brake a much heavier car...
From what I'm reading, the issue isn't with fans not performing as expected. The issue is that at the performance rate Nvidia had them at, they simply didn't do the job needed, resulting in the GPU overheating and destroying itself.
It is entirely, 100% Nvidia's fault. If you put in substandard parts you get a substandard result.
Re:So, is it not fair (Score:5, Insightful)
Long story short, all manufacturers should be held accountable for the idiotic shortcuts they take when it comes to cooling their electronics. It's kind of an important aspect of electronics, no? Why not spend a buck or two more on something that actually does the job? Till then, the first thing I do with any graphics card (or CPU for that matter) is still going to be to chuck the stock cooler into my parts bin, and then look for something bigger or better.
Re:So, is it not fair (Score:4, Funny)
Re:So, is it not fair (Score:5, Informative)
Why is it all Nvidia's fault, seems to me it should be a shared responsibility.
I work for a company big into mobile IC design (like NVIDIA). And I can say that it is very likely NVIDIA's fault because they (as do we), as the design company, specify every last detail of process, circuit, and package, when it comes to IC fabrication. Additionally, the company which produced these chips--TSMC--is the oldest, largest, and possibly most reliable dedicated fab company in existence. If there is a heat dissipation problem, it almost certainly stems from engineering oversight or management's corner-cutting on NVIDIA's part.
Re:The problem extends to other Dells (Score:4, Interesting)
My DELL XPS M1710 has a 7950GTX and never had any issues. The DELL BIOS does have some issues with heat management, so I run i8kfan to keep heat at acceptable levels.
On top of that, did you know most new DELL laptops (confirmed on XPS and VOSTRO) won't read S.M.A.R.T.? I think heat killed my original hard drive, but the BIOS wouldn't report the drive was going bad. They should fire whoever made the decision that removing this feature was an improvement.