News for nerds, stuff that matters

Laptops With Certain NVidia Chips Failing

Posted by timothy on Thursday July 31, @11:02PM
from the hardware-sucks dept.
Eukariote writes "An estimated 18 million laptops with NVidia G84 and G86 graphics chips sold in the past one and a half years are experiencing high failure rates. Various laptop models from multiple manufacturers (Apple, Dell, HP, Lenovo, and others) are affected. NVidia blames it on bad chip packaging causing thermal failure. BIOS updates that turn the laptop fan on more frequently or permanently have been released by Dell and HP. The cynical interpretation is that this is likely to only delay the problem until the warranty has expired."

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by DogDude (805747) on Thursday July 31, @11:06PM (#24427725) Homepage
    Having to have my laptop fan on all of the time to account for a bad chip is an unacceptable fix. It's loud, it takes more electricity to run, and it shortens the life of the fan, and possibly the whole computer as a result.
    • by mysidia (191772) on Friday August 01, @02:27AM (#24429123)

      Note that they conveniently prevent [hp.com] you from downloading the old BIOS to revert the upgrade if the increased fan noise is a problem for you, by removing the old version from their web site. Under the pretense of "avoiding confusion", they will not allow you to get the original version:

      I do not see the previous BIOS version on the HP Support Web? What happened to the previous versions of the BIOS? In order to eliminate any confusion on which BIOS version is the latest, only the latest version is available on the Web.

    • by jamesh (87723) on Friday August 01, @02:59AM (#24429261)

      Ford tried to do this to me with my car. It would make a shuddering noise somewhere in the front end at low speed (eg parking lots). I mentioned it to them each service and they said they'd look at it, and when I got it back after the service they said they'd flushed the power steering system and upgraded the car computer firmware.

      The first service after the warranty expired I took it in and they said that there was a faulty hose causing the problem and it would take $$$ to fix. I got them to fix it under warranty eventually but I wonder how many other people they screwed over...

  • Today's fun fact (Score:5, Insightful)

    by mu11ing1t0ver (1175051) on Thursday July 31, @11:09PM (#24427753)
    "The power of accurate observation is frequently called cynicism by those who don't have it." - George Bernard Shaw
  • by Ethanol-fueled (1125189) * on Thursday July 31, @11:11PM (#24427773) Homepage
    Here are excerpts from the most amusing description [theinquirer.net] of the problem:

    All Nvidia G84 and G86s are bad

    The short story is that all the G84 and G86 parts are bad. Period. No exceptions. All of them, mobile and desktop, use the exact same ASIC, so expect them to go south in inordinate numbers as well. There are caveats however, and we will detail those in a bit.

    Both of these ASICs have a rather terminal problem with unnamed substrate or bumping material, and it is heat related. If you ask Nvidia officially, you will get no reason why this happened, and no list of parts affected, we tried. Unofficially, they will blame everyone under the sun, and trash their suppliers in very colourful language.

    When the process engineers pinged by the INQ picked themselves off the floor from laughing, they politely said that there is about zero chance that NV would change the assembly process or material set for a batch, much less an EOL part.

    For dessert, there's this [theinquirer.net] article to finish :)

  • by Anonymous Coward on Thursday July 31, @11:18PM (#24427835)
    From NVIDIA's Q2 FY2009 Business Update [nvidia.com]:

    Separately, NVIDIA plans to take a one-time charge from $150 million to $200 million against cost of revenue for the second quarter to cover anticipated warranty, repair, return, replacement and other costs and expenses, arising from a weak die/packaging material set in certain versions of its previous generation GPU and MCP products used in notebook systems. Certain notebook configurations with GPUs and MCPs manufactured with a certain die/packaging material set are failing in the field at higher than normal rates. To date, abnormal failure rates with systems other than certain notebook systems have not been seen. NVIDIA has initiated discussions with its supply chain regarding this material set issue and the Company will also seek to access insurance coverage for this matter.

    Regarding the notebook field failures, NVIDIA president and CEO Jen-Hsun Huang stated: "Although the failure appears related to the combination of the interaction between the chip material set and system design, we have a responsibility to our customers and will take our part in resolving this problem. The GPU has become an increasingly important part of the computing experience and we are seeing more interest by PC OEMs to adopt GPUs in more platforms. Recognizing that the GPU is one of the most complex processors in the system, it is critical that we now work more closely with notebook system designers and our chip foundries to ensure that the GPU and the system are designed collaboratively for the best performance and robustness."

    Today's high performance notebooks are highly complex systems with extreme thermal environments. The combination of limited thermal management and frequent power cycling is particularly challenging for complex processors like the GPU.

    Huang added, "This has been a challenging experience for us. However, the lessons we've learned will help us build far more robust products in the future, and become a more valuable system design partner to our customers. As for the present, we have switched production to a more robust die/package material set and are working proactively with our OEM partners to develop system management software that will provide better thermal management to the GPU."

  • by cdance (516169) on Thursday July 31, @11:22PM (#24427863)
    As detailed in this thread [laptopvideo2go.com], the GF8400 has serious performance problems under Vista Aero when running recent driver versions. I wonder if this is related, i.e. recent driver updates have down-clocked the GPU, leading to bad performance. Dell has, however, recently acknowledged the problem and is working on a fix.
  • waiting to form.

    Charlie gets it right. Let's see, 18 million notebook machines. Freight each way, plus cost of labor to fix them and the materials needed. Less than $10 a machine! Great, that math stuff. Yup, a $150-200 million charge oughta do it at around $10 a machine!

    Hello? This is the SEC? Hey, I have a question about an 8K I saw for NVidia. It goes like this.....
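
    The per-machine figure above can be checked with a quick calculation. This is only a rough sketch: it assumes the 18 million laptop estimate from the summary and the $150-200 million charge from NVIDIA's business update, and spreads the charge evenly over every machine. The function name is made up for illustration.

```python
def charge_per_machine(charge_usd: float, machines: int) -> float:
    """Return the charge spread evenly across all machines, in dollars."""
    return charge_usd / machines


# Assumed inputs: 18 million affected laptops (estimate from the summary)
# and the $150M-$200M one-time charge range from NVIDIA's update.
MACHINES = 18_000_000
low = charge_per_machine(150e6, MACHINES)
high = charge_per_machine(200e6, MACHINES)

print(f"${low:.2f} to ${high:.2f} per machine")
# prints "$8.33 to $11.11 per machine"
```

    In other words, roughly $8 to $11 per machine if every one of the estimated 18 million laptops were covered, which is where the "around $10 a machine" comes from.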

  • by aztektum (170569) on Friday August 01, @12:05AM (#24428219)

    Sorry, I was distracted by the picture of the BREASTS on TFA page

  • The USAF had a reliability program that ran from the mid-1960s to the mid-1980s which did quite a bit to make electronics more reliable in the field. About 1% of the USAF's "black boxes" were marked with stickers that said something like "USAF Reliability Program Unit - If unit breaks, replace entire unit and send broken unit to ... for analysis".

    When broken units came into the analysis shop, a considerable effort was made to find out exactly which component had failed and how it had failed. This went way beyond normal repair. When a bad part was located, the part was opened up and examined with an electron microscope or X-rayed, as appropriate, to see exactly what had gone wrong.

    The USAF would frequently publish pictures from this program in Aviation Week. You'd see pictures of bad lead joints inside an IC package, too-long internal leads that had failed under high G loads, and bad on-chip etching. Manufacturers of bad parts were named. Inspectors were sent to plants to figure out what had gone wrong with the manufacturing process. The problem got fixed or the supplier stopped getting military contracts.

    This worked well when the military bought most electronic components. By the 1980s, consumer electronics were using electronics at least as sophisticated as the military, and the military had to start using "commercial, off the shelf" components. Today, the USAF has trouble getting any special attention from parts suppliers.

    Auto manufacturers still do things like this. Because they have to pay for recalls, they need to find out why things break and fix the production process, even if it's at a supplier.

  • There is a problem with the chips, there is no doubt about that. However, take anything Charlie says about it with a huge truckload of salt. There was a bit of bad blood between Nvidia and Charlie years ago (something like 4 or 5 now), and ever since they've refused to talk to anyone from the Inquirer, and Charlie specifically.

    It seems these days that all [theinquirer.net] Charlie [theinquirer.net] does [theinquirer.net] is [theinquirer.net] write [theinquirer.net] long [theinquirer.net] articles [theinquirer.net] bashing [theinquirer.net] Nvidia [theinquirer.net]. That is, unless he's writing an article that's so over the top that his editor has to pull it [theinquirer.net] (yes, believe it or not, there actually is an editor in charge of all those pieces).

    Go read the Dell or HP forums and EE Times. Read The Inq only if you want some amusement to see how amazingly slanted a story can be.

    • Re:Model numbers (Score:5, Informative)

      by Hemogoblin (982564) on Thursday July 31, @11:22PM (#24427867)

      Here are the Dell models which have BIOS updates, from TFA:

      Inspiron 1420
      Latitude D630
      Latitude D630c
      Dell Precision M2300
      Vostro Notebook 1310
      Vostro Notebook 1400
      Vostro Notebook 1510
      Vostro Notebook 1710
      XPS M1330
      XPS M1530

    • Re:Model numbers (Score:5, Interesting)

      by Anonymous Coward on Thursday July 31, @11:24PM (#24427881)

      A link? Shit I own one. Dell XPS m1330; I've had the motherboard replaced twice already for video failure, and I got the thing in September of 07. Yes, that's right, replaced twice in less than a year.

      The flaw is every bit as bad as everyone makes it out to be.

    • Re:Model numbers (Score:5, Informative)

      by Gideon Fubar (833343) on Thursday July 31, @11:29PM (#24427915) Journal
      Sadly, it's not the laptops that are the problem. The problem apparently exists in all G84 and G86 chips, including those on desktop models.

      This was reported by the Inquirer (and here, I think) a few weeks ago, but apparently the news hasn't been getting around.
    • by Manip (656104) on Thursday July 31, @11:38PM (#24427989)

      But it is Nvidia's fault because they signed off on these cooling units.

      That is like saying it isn't your car maker's fault if they put brakes in your car designed for a lawnmower, and that it is instead obviously the fault of the people making these lawnmower brakes for not making sure they can brake a much heavier car...

      From what I'm reading, the issue isn't with fans not performing as expected. The issue is that at the performance rate Nvidia had them at, they simply didn't do the job needed, resulting in the GPU overheating and destroying itself.

      It is entirely, 100% Nvidia's fault. If you put in substandard parts you get a substandard result.

      • by MachDelta (704883) on Thursday July 31, @11:52PM (#24428115)
        Agreed. Most reference coolers (and even a lot of 3rd party ones) aren't worth the cheap plastic used to make them. When I pulled the ref cooler off my 8800GT last year I was shocked to find that the fan didn't even sit completely atop the core, and that there was a LOT of excess thermal paste and stupidly thick thermal pads. It's little surprise the card was heatsoaking to 90C after a few hours of Bioshock and crashing itself! I can only cringe in horror when I imagine something like that stuffed into a freaking laptop. Fortunately I had already planned on replacing the stock cooler (just a big heatpipe/heatsink with a 120mm fan ziptied to it) and lo and behold my card now has trouble hitting low 40's even after hours of flogging.

        Long story short, all manufacturers should be held accountable for the idiotic shortcuts they take when it comes to cooling their electronics. It's kind of an important aspect of electronics, no? Why not spend a buck or two more on something that actually does the job? Till then the first thing I do with any graphics card (or CPU for that matter) is still going to be to chuck the stock cooler into my parts bin, and then look for something bigger or better.
    • by IorDMUX (870522) <{ude.esac} {ta} {3zdm}> on Friday August 01, @12:50AM (#24428543)

      Why is it all Nvidia's fault, seems to me it should be a shared responsibility.

      I work for a company big into mobile IC design (like NVIDIA). And I can say that it is very likely NVIDIA's fault because they (as do we), as the design company, specify every last detail of process, circuit, and package, when it comes to IC fabrication. Additionally, the company which produced these chips--TSMC--is the oldest, largest, and possibly most reliable dedicated fab company in existence. If there is a heat dissipation problem, it almost certainly stems from engineering oversight or management's corner-cutting on NVIDIA's part.

    • No, Xbox 360's use an ATi chip.

      Although RoHS probably contributed to the RRoD, mostly it was an improper thermal solution. There was an article awhile back where it was discovered that Microsoft engineers decided to cut costs by designing the heatsink system themselves. Insufficient cooling and an improper mounting system allowed the board to warp more than the RoHS solder could handle. Newer 360's have lots of extra epoxy around the package to keep it from pulling too far away from the motherboard.

    • The lack of lead in solder is a technological issue and as such is solved by more advanced technology. Certainly there are few people here who are opposed to higher technology?

      Sure, we can whine about the extra work we are forced to do, or the fact that we have to pay for higher technology, but what good does that do? As technologically savvy people we live for the chance to advance the technology. We see these opportunities all over the place. Smaller cars require innovative means to increase safety and power. Smaller computers require more power efficient components and better batteries. Having one type of plastic go away just opens up a space for innovative new plastics. This is what makes the world exciting.

      So, if some company can't keep up, then they just suck as technologists and need to go away. A car company can't make technologically advanced cars, screw them. A video card manufacturer can't keep up with the trends and make a reliable video card, screw them too. I have been involved in a number of situations where the process had to be rethought. Someone whines that a baby might be born with a defect and we can't use this chemical. Someone complains that the dust will give them cancer and we must use a hood. Someone complains that we can't reliably dispose of an agent, and we must switch agents. Sure, we could say who cares if some worker dies. So what? But in each case the change was made, and technology gave us an equal or better solution.

      It is always easier to blame failure on external forces rather than taking responsibility for a personal lack of creativity. This change in solder is not the first scapegoat used by those that lack innovative solutions, and won't be the last. There will always be firms that say a problem can't be solved, and they will generally be overthrown by those who then find the solution. I think that any number of lazy American firms are discovering that right now, while others are riding the wave of can-do innovations.

    • My MacBookPro turned on one morning, and everything worked but the display. I managed to log in, launch iTunes and play some music, but no graphics output. A trip to the Apple store later and I'm out a machine for a week. Never had an explanation, but now I am curious if I should send it back and ask for a new logic board with a graphics chip that isn't going to fail again prematurely due to faulty design.

      Well, unless your replaced logic board fails again, I don't think Apple would take it back for replacement, since it basically works. Unfortunately, the affected GPUs are basically the entire nVidia 8x00 line (except for desktop 8300, and all the 8800's). Very few laptops actually use the 8800M GPU (think gaming laptops), so any other replacement, even a new laptop with an nVidia chipset will likely have the problematic GPU. The other alternative is to find a laptop with an AMD/ATi or Intel GPU.