How Many Android OEMs Cheat Benchmark Scores? Pretty Much All of Them
An anonymous reader writes "After Samsung got caught out cheating on benchmarks (Note 3, Galaxy S4) AnandTech has done a detailed analysis of the state of benchmark cheating amongst Android OEMs. With the exception of Motorola, literally every single OEM they've looked at ships (or has shipped) at least one device that does benchmark-specific CPU optimizations. AnandTech also thinks it will get worse before it gets better. 'The hilarious part of all of this is we’re still talking about small gains in performance. The impact on our CPU tests is 0 - 5%, and somewhere south of 10% on our GPU benchmarks as far as we can tell. I can't stress enough that it would be far less painful for the OEMs to just stop this nonsense and instead demand better performance/power efficiency from their silicon vendors.' The article notes that Apple doesn't do any of the frequency gaming stuff."
Easy solution (Score:5, Interesting)
Re: (Score:3)
So then you just look for behavior. Oh look a bunch of unnatural activity, change the governor to performance instead of on_demand.
Re: (Score:2)
So, make the benchmark software resemble the composite behaviour of common classes of apps. OGL benchmarks effectively act as the "typical" 3D game and so on.
At which point cheating becomes pointless, as any tweaking in favour of performance on those benchmarks immediately hits you as worse battery life and high temps for all users running any similar kinds of apps, including your reviewers.
Re: (Score:2)
You want something that's consistent and reliable from device to device and run to run; that consistency dooms any attempt to cloak the behaviour of the app to failure.
Re: (Score:3)
Oh look a bunch of unnatural activity, change the governor to performance instead of on_demand.
That shouldn't make that much of a difference... if the governor is in the low_power setting, limiting the CPU to minimum clock speed, then there'll be a performance hit on benchmarks, but with on_demand, by the time the system has realized it needs to change the governor to performance, the clock speed will have already been increased to maximum due to CPU load. Meanwhile, you're wasting clock cycles monitoring what stuff is doing....
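To see which governor a device is actually using during a run, the standard Linux cpufreq sysfs files can be read directly. A minimal sketch, assuming a Linux-based device where `/sys/devices/system/cpu/cpu0/cpufreq` is readable (on many phones this needs a shell or root access); the helper names `read_cpufreq` and `snapshot` are made up for illustration. Note that the kernel spells the governor names `ondemand`, `performance` and `powersave`.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location for the first core.
# Assumption: the path exists and is readable on the device under test.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_cpufreq(name, base=CPUFREQ_DIR):
    """Return the stripped contents of one cpufreq file, or None if unavailable."""
    try:
        return (Path(base) / name).read_text().strip()
    except OSError:
        return None


def snapshot(base=CPUFREQ_DIR):
    """Collect the active governor plus current and maximum frequency (kHz)."""
    return {
        "governor": read_cpufreq("scaling_governor", base),
        "cur_khz": read_cpufreq("scaling_cur_freq", base),
        "max_khz": read_cpufreq("scaling_max_freq", base),
    }
```

Calling `snapshot()` repeatedly while a benchmark runs, and again while an ordinary app runs, would show whether the governor or the maximum frequency differs between the two.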
Re: (Score:2)
I was thinking of cheating on the battery test, for example, or having it lift the max CPU clock above what is normally used.
Re: (Score:3)
Fair enough, and not an unreasonable measure.
Truthfully, and I know I'm in the minority in a discussion about benchmarks, cell phones have been "good enough" for quite a while, and I bought mine based on feature list and price. I bought the cheapest phone that had all of the features I wanted: FM radio, GPS, bluetooth, wifi tether, and that's it. Didn't even care whether it had 4G connectivity (though it did). Even a bottom-end $100 smart phone with an 800MHz single core processor is in the performance rang
Re: (Score:2)
I totally agree. I want a better display, with more accurate color and such, more than a faster CPU.
Re: (Score:2)
That's an excellent idea. Now all you have to do is look for a way to do it with almost no overhead.
Re: (Score:2)
The goal is to make detection so complicated that the simple act of detecting a benchmark reduces performance to levels impossible to compensate by cheating.
Re: (Score:2)
Battery life measurements should take place with the benchmark app running.
Or, alternatively (Score:5, Insightful)
Re: (Score:3, Interesting)
On the contrary, the testers should just be bigger dicks. "We detected benchmark-specific optimizations in products #1, #2 and #3, so they all got zero points."
Re: (Score:2)
You think there's fanbois here, imagine how torn up they'd get if they did that.
Re:Or, alternatively (Score:5, Insightful)
On the contrary, the testers should just be bigger dicks. "We detected benchmark-specific optimizations in products #1, #2 and #3, so they all got zero points."
That seems quite arbitrary. What about "To test the battery, we tested how many minutes the battery lasts while running benchmark X". The cheaters will get shorter battery life.
Re: (Score:3)
Why would they get shorter battery life?
They can't artificially boost performance without a drain on the battery.
Re: (Score:2)
I find that I get better MPG on I-95 than on I-495 or I-695.
Re: (Score:2)
I'm a little confused about this. I was under the impression that what they were doing was ensuring that the clock speed was running at full, not slowed down for power saving, etc. From my point of view, that would just give a consistent reading of how fast the phone could run and wouldn't be considered 'cheating' unless they were boosting the clock speed up over normal running speeds. I assume some games and other applications also force the processor to full, but perhaps this is not the case. The whole t
Wrong, they are boosting clock speed above normal (Score:3, Informative)
I was under the impression that what they were doing was ensuring that the clock speed was running at full, not slowed down for power saving, etc.
No. They are running at a clock speed that no real application will see under any circumstance, either the GPU or CPU cock increased.
It is within what the parts are rated for but not what the device was built to run at normally.
I assume some games and other applications also force the processor to full
There is no way to build a game on Android that can run at the
Re: (Score:2)
There is no way to build a game on Android that can run at the speed the benchmarks are getting run at on each of the devices "cheating".
What happens if you name your game to match the benchmark that the phone is looking for?
Re: (Score:2)
Thanks, as I said, I was under the impression they were just ensuring it was not reduced. If there's no way an app could get the same speed then it is most definitely cheating. I'm surprised they don't overclock. If you're going to cheat you might as well do it right.
Re: (Score:2, Informative)
No, he was right. The phones CAN and DO reach that clock speed. Read the AnandTech article. The graph of CPU speed shows it quite clearly.
Re:Wrong, they are boosting clock speed above norm (Score:5, Funny)
either the GPU or CPU cock increased.
Whoa, can cell phones do that now? I hold these things to my ear, for Christ's sake!
Re: (Score:2)
either the GPU or CPU cock increased.
Whoa, can cell phones do that now? I hold these things to my ear, for Christ's sake!
Yeah, they'll f**k your brains out... literally.
Re:Or, alternatively (Score:4, Insightful)
Just buy an iPhone. Apple doesn't cheat apparently.
Re:Or, alternatively (Score:5, Informative)
Benchmarks are about as useful as manufacturer spec sheets. Take both with a few metric tonnes of salt.
Re: (Score:3)
Pfff. Car manufacturers tape up the air intakes and door seams on their cars to do fuel economy runs, just to eke out every last 0.1mpg. Running your car like that for any reasonable period of time would wreck the engine pretty quick.
Does any manufacturer report their own measured mileage? Within the US, the only number that matters is measured by the EPA, on a dynamometer.
Re:Or, alternatively (Score:5, Informative)
Actually, manufacturers do report their own efficiency numbers, and the EPA spot-checks them.
http://business.time.com/2012/12/10/more-reason-to-be-skeptical-about-new-car-mpg-claims/ [time.com]
Re: (Score:3)
Actually, manufacturers do report their own efficiency numbers, and the EPA spot-checks them.
http://business.time.com/2012/12/10/more-reason-to-be-skeptical-about-new-car-mpg-claims/ [time.com]
Not only that, but they are allowed to use the same numbers if the drivetrain and weight of the vehicle are the same as those of a previously tested vehicle.
Re: (Score:2)
you're assuming the benchmark authors aren't working directly with the hardware manufacturers.
Comment removed (Score:5, Insightful)
Re: (Score:2)
The idea that people should have to come up with less-reliable, improvised tests because the hardware manufacturers are going to crawl over each other to cheat any consistent, scientific one is kind of depressing. You kind of know that all these phone companies are first-rate bullshit artists, with Samsung running at the fore, but the idea even that their engineers are going out of their way to fake their product to the top of the specs pile is just sad.
Re: (Score:2)
On the contrary, they should come up with more meaningful tests. On the x86 side, there are quite a few staples of testing that are also real-world scenarios, like RAR decompression, encryption and video and mp3 encoding. For GPUs, an assorted bundle of real games are great for measuring performance.
Mechanical Turk Benchmarks (Score:3)
Instead of automated benchmarks of hardware, why not real world human benchmarking where a group of people is given a set of tasks to do on a given cell phone platform and see who can do them faster?
Automated technical benchmarks make sense when the workloads more or less approximate the benchmark -- video gaming, 3D modeling, disk throughput, etc.
But unless I'm living totally in the dark, most people aren't buying cell phones oriented towards single-task performance (eg, gaming). They get used for many ta
Comment removed (Score:4, Informative)
Re: (Score:2)
Better solution. No benchmark test should be accepted unless the software was written AFTER the release of the hardware being tested. And then, yes, randomize various things including the compiled code.
Re: (Score:2)
Alternatively, can we get an app that changes the name of any game to one of the benchmarks' names?
Re: (Score:2)
Can you even do that on Android? Generally to rename the package on Android you need to take the APK apart and rename the DEX files there.
Right now, it looks like the most impressive tests are the ones done with the browser, purely because cheating there generally means it works for everyone.
(Though even then I can see why they'd cheat - an iPhone 5s dual core processor can keep up with quad core SoCs running nearly twice as fast (1.3GHz A7 vs. 2.
Re: (Score:2)
That sounds like a great idea, but that would just favor phones with larger batteries. At the end of the day, the benchmark wouldn't accurately represent real life use. Which is the only reason to look at them, unless you're a fan of bar charts and statistics.
And Apple (Score:5, Informative)
With the exception of Motorola...
And Apple. Apple and Motorola/Google are the only two companies that don't boost their devices for benchmark tests. If you're going to give credit to one, please do be fair and give credit to the other.
I respect both of them for that level of integrity and I hope they stick to their guns and remain honest.
I may be an Apple fanboy (and I am) but I'm really looking forward to seeing what Motorola starts releasing in about a year once Google's able to, as they said, flush things out of the system and start releasing truly Google-designed products.
Re: (Score:3, Interesting)
Re:And Apple (Score:4, Interesting)
I've heard a number of people I trust comment on how the Moto X just feels really good in the hand and how the screen and size is just right. So the phone is certainly not bad at all, and most of us don't actually buy the high-end phones any more than we buy the high-end cars or bikes. A really good mid-range phone is exactly what I want; I'm already making calf-eyes toward the coming Sony Z1 Mini.
What's hurting Motorola for me, personally, is that it's simply not on sale in most of the world, and seems unlikely to ever be. It's not just about being able to get it where I live, but having a phone designed from the start to be usable in all major regions of the world as I travel.
Re:And Apple (Score:4, Interesting)
Re: (Score:2)
Posting anonymously since I'm a motoroogle employee... you'll be disappointed. I certainly am. At this point, I expect google to shut us down or spin us off.
Care to elaborate on why we'll be disappointed? I ask because this thread is focused on (perceived) performance of smartphones, and I know Motorola caught some criticism for not giving the Moto X better silicon to compete with other flagships, but personally, I think their tactic is pretty smart: focusing on functionality for the average user instead of performance capabilities which most people don't care about. I'd actually consider buying a Moto X if it weren't for the non-removable battery (deal breake
Re: (Score:2)
With the exception of Motorola...
And Apple. [...] If you're going to give credit to one, please do be fair and give credit to the other.
They did, and it's in the summary. You didn't even have to read TFA. Additionally, the headline narrows this down to Android OEMs, so that's why Apple was excluded from the discussion until the very end.
The article notes that Apple doesn't do any of the frequency gaming stuff.
Re:And Apple (Score:4, Informative)
Re: (Score:3, Insightful)
I fear Motorola is dead.
The Moto X was supposed to be a device with high amounts of post-Google input. It is a high-priced, mediocre mid-range device. They still cater to carriers, unlike Apple. They need to compete with the 5S, One X and S4. Instead they are competing with last year's product.
Moto needs to copy Apple in some things. Make 1 main device, sell the old one or a cheaper version as well. Max 2-3 devices. Refresh them once a year. This lets third parties make all kinds of doodads for the device. Put
Re:And Apple (Score:5, Interesting)
It's not a mid-range device. It's only mid-range if you look at the spec sheet and nothing else. Its (non-gamed) benchmarks are actually pretty good for all this talk of 'mid-range'. They did the same thing Apple did and tried to balance out performance with battery life. They didn't put the biggest screen in it, and they have optimised silicon to listen for commands without keeping the CPU on all the time.
Specs aren't the war that anyone should be trying to win in the mobile space. That kind of thinking is why there are phones that only last half the day.
Re: (Score:2)
It is a mid range device if you hold one. Go get one and tell me it is not a midrange device.
I am not suggesting chasing specs, I am suggesting building a premium device.
Re: (Score:3)
It can't be worse than the Samsungs I've handled. And the HTC One is premium and beautiful, but that doesn't seem to be working out.
I will accept that you personally are calling it a mid-range device on the build quality alone, but so much of the commentary has focussed on how it has 'mid-range specs' and is overpriced for the screen/CPU/GPU that's in there.
There are trade-offs to be had. iPhones are top-tier devices, but they've got smaller screens than the top-tier Android phones. The S4 is a plasticky t
Re: (Score:2)
The One is a really nice device. If they would kill SenseUI already it would be the best Android device on the market.
Not only build quality but some specs matter, think less about things like resolution and more about how accurate colors are or the white balance on the device is. Pentile vs RGB IPS is a good example of this sort of thing.
The S4 might have a lot of plastic, but it makes it rather durable. They should have had the S4 Active as the only model. Instead they made that AT&T exclusive. It is
Re: (Score:2)
Good point, the article should have congratulated all the companies that were not caught cheating on android performance tests:
Microsoft, ...
Walmart,
Berkshire-Hathaway,
Lockheed Martin,
that "self employed" beggar across the street,
Burger King,
Ford Motor Company,
That could take a while actually, let's just congratulate the company you're a fan-boy of then, shall we?
Re: (Score:2)
It's funny that the number of iPhones those analysts claim Apple really sold, is exactly the same as the amount of iPhones they blind-ass-guessed would sell before the sales figures came out. It's almost like they were trying to defend their original estimates and therefore their reputations as analysts.
Re:And Apple (Score:4, Insightful)
And a little deeper analysis will show....
a) Apple accounted for numbers the same way they always did.
b) Apple does not count phones as sold if they are still in Apple Stores inventories.
c) If Apple were channel stuffing, phones would readily be available at carriers.
The fact is that the "analysts" were wrong and are trying to cover their tails.
Do you really believe that there is not enough demand in the launch countries to sell 9 million phones in one weekend?
Do you think that Apple will be forced to take a write down (like MS and BlackBerry) for unsold inventory in the channel?
Re: (Score:2)
I'm not going to bother to follow the rest of your links sorry.
CPU Benchmark Shenanigans ROI?? (Score:2, Informative)
'The hilarious part of all of this is we’re still talking about small gains in performance.'
The even more hilarious part is that OEMs are going to the trouble to do this when CPU benchmark scores are a very small factor in the decision of most consumers to buy these phones. I doubt that the ROI is higher than say, oh, improving the user experience of the GUI or call QoS.
Hilarious... (Score:4, Insightful)
The hilarious part of all of this is we’re still talking about small gains in performance.
The hilarious part of all this is that most people really don't give a rat's ass about performance when selecting a phone or even a tablet. The criteria are things like: How does it handle? How intuitive is the UI? Can I watch my favorite online video feeds on this thing? Are any buttons in annoying places? What's the price? Does this thing have software to view and edit MS Office files I get sent by mail? The only performance test these smartphone and tablet things usually get is someone playing around with a display model in the shop and seeing if the UI is nice and snappy. Nobody except tech nerds gives a rat's ass that a Samsung Galaxy S4 gets a few more FPS in Modern Combat than an iPhone 5.
Re: (Score:2)
Nobody cares about the benchmarks quantitatively, but when you're putting down something like $700 on a computing device getting something that's "the fastest" is a big driver of decisions. I'm sure that guaranteeing a bunch of "Samsung Galaxy blah takes performance crown from Apple iPhone whatevs" stories in the tech press on the basis of a 0.5% difference in the Zootybench score is exactly why Samsung does this crap.
Re: (Score:2)
Naw, I would argue that having the fastest phone was only a deciding factor when Apple started boasting about their new 64bit CPU in the iPhone 5s. Prior to this month the average consumer would not even know or care if their phone had 1, 2 or 4 cores or how many goggleflops it was able to perform.
Apple decided the only way they can differentiate their iPhone 5s amidst all the comments that they are no longer innovative is to create a competitive market based on useless CPU performance numbers, just like w
Re:Hilarious... (Score:4, Insightful)
Apple decided the only way they can differentiate their iPhone 5s amidst all the comments that they are no longer innovative is to create a competitive market based on useless CPU performance numbers, just like what Apple did with Retina displays. Before Retina, nobody cared about pixel density. Before A7, nobody cared about CPU performance or its bittyness. Before the iPhone 5s camera, nobody cared about the size of the CCD pixel on their phone camera.
Dude, it's not that long ago that articles like this slashvertisement [slashdot.org] were plastered all over the web, accompanied by comments filled with enthusiastic boasts by hordes of Android fans detailing how iPhone performance sucks ass. Even if blisteringly fast benchmark performance is pretty low down on the list for most people out to buy a smartphone, I still can't fault Apple for trying to put a sock in the collective mouth of the Android community. There is a certain personal satisfaction to be had from making the choir of hard-core Android fans shut up about benchmarks until Samsung comes up with a still faster device (hopefully free of benchmark cheating this time), even if the customers will probably hardly notice this stupid pissing contest.
Re: (Score:3)
I might not care as much about performance as I care about battery life, and a rating of that is performance/Watt.
Re: (Score:2)
I might not care as much about performance as I care about battery life, and a rating of that is performance/Watt.
Ditto, that's probably the only part of a benchmark test that I really care about.
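Performance per watt is just the benchmark score divided by the average power draw during the run. A minimal sketch with made-up numbers (the scores and wattages below are hypothetical, not measurements from any device), showing how a small score boost can still lower efficiency if power rises faster than the score:

```python
def perf_per_watt(score, avg_watts):
    """Benchmark score per watt of average power draw during the run."""
    return score / avg_watts


# Hypothetical figures for illustration only.
normal = perf_per_watt(score=1000, avg_watts=2.0)   # stock clocks
boosted = perf_per_watt(score=1050, avg_watts=2.5)  # 5% higher score, 25% more power

# The boosted run scores higher but does less work per watt.
efficiency_drop = 1 - boosted / normal
```

With these assumed inputs the boosted run is the less efficient one, which is why a benchmark that reports both score and power would make the boost less attractive.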
Re: (Score:2)
The hilarious part of all this is that most people really don't give a rat's ass about performance when selecting a phone or even a tablet. The criteria are things like: How does it handle? How intuitive is the UI? Can I watch my favorite online video feeds on this thing? Are any buttons in annoying places? What's the price? Does this thing have software to view and edit MS Office files I get sent by mail? The only performance test these smartphone and tablet things usually get is someone playing around with a display model in the shop and seeing if the UI is nice and snappy. Nobody except tech nerds gives a rat's ass that a Samsung Galaxy S4 gets a few more FPS in Modern Combat than an iPhone 5.
Don't forget "Does it start with an 'i'". For some people that is the strongest factor. And it could be a plus or a minus factor.
Re: (Score:3)
Here are some tests I've done on my devices compiling Povray [povray.org] 3.6 (single threaded) and rendering the benchmark scene. I was rather surprised that my new Nexus 4 (Qualcomm S4) is slower per core than my SGS II (Samsung Exynos 4). Yet the Nexus 4 doesn't feel any slower and is my main phone now due to the larger and higher resolution screen.
Athlon II x4 (2.8 GHz): 179.82 pps ; 64.22 pps/GHz
Exynos 5 (1.7 GHz): 77.36 pps ; 45.51 pps/GHz (-mfloat-abi=hard -mcpu=cortex-a9 -mthumb -mthumb-inter
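The per-GHz figures above are simply the render rate divided by the clock speed. A quick check using only the two complete (pps, GHz) pairs quoted in the comment; the function name is made up for illustration:

```python
def pps_per_ghz(pps, ghz):
    """Normalize a POV-Ray render rate (pixels per second) by clock speed."""
    return round(pps / ghz, 2)


# (pps, GHz) pairs as quoted in the comment above.
results = {
    "Athlon II x4": pps_per_ghz(179.82, 2.8),
    "Exynos 5": pps_per_ghz(77.36, 1.7),
}
```

Normalizing this way compares per-clock efficiency of a single core, which says nothing about multi-core throughput or how fast the device feels in use.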
The best part (Score:4, Funny)
If you measure it, it gets better (Score:5, Insightful)
You can't measure everything, so your best bet is to try to keep the measurement methods secret and change them frequently. Unless, of course, your measurements are intended to improve a particular area, then by all means, measure on.
Meh. (Score:2)
This has been going on since benchmarks were a thing. So a long time.
GPUs in particular had a bad rep for this. However, they actually got the benchmark software altered!
Simply checking for a process name and overclocking isn't exactly all that sneaky. Maybe a bit unethical, but still. Whoever is running the benchmark could easily check for that. Of course, it also depends on how locked down the OS is against inspection.
Apple does not need to lie. (Score:2)
Native Code Execution, it runs as fast as it is going to run.
Re:"Pretty Much All of Them" (Score:4, Funny)
Step #1
Hype new iPhone
Step #2
Release new iPhone
Step #3
Immediately release new iOS update.
Step #4
Watch existing iPhone users complain after the iOS update cripples older models.
Step #5
Laugh maniacally after existing iPhone users stand in lines waiting for new uncrippled iPhones.
Step #6
PROFIT
Re: (Score:2)
iOS 7 runs on pretty much everything newer than the iPhone 3GS.
iPhone 4 runs iOS 7 (2010 design).
iPhone 4S runs iOS 7 (2011 design) and is still being sold.
What 2011 Android phone runs the most current version of Android and is STILL being sold?
Re: (Score:2)
What difference does that make?
Re: (Score:2)
I was asking: what difference does it make that only Apple makes iOS devices, when those devices run the same benchmarks as anyone else?
You seem to have answered another question entirely.
Re: (Score:2)
Well, to answer your question, the difference is that Android manufacturers compete with other Android manufacturers. There is more pressure to compete within the Android ecosystem than within Apple's. Right now, most people decide Apple or Android, then choose the particular device within the ecosystem. That's where they would turn to a benchmark, if they were to use one at all. IMHO, they're stupid to use one at all because they don't measure real world performance.
Re: (Score:2)
I dunno, for the areas where compute performance matters the most - i.e. games - I find that there's a lot of parity between the software ranges on Android and iOS, so there's a lot of reasons to compare across ecosystems. I'm an iOS owner now and I'm being swayed by the choice between a Nexus that's guaranteed to have top of the line specs and therefore a long gaming life ahead of it, or paying the same money for a rather crusty older-model iPhone.
Re: (Score:2)
There are cross-platform benchmarks [primatelabs.com] comparing performance across different phone OSes; it would be in Apple's best interests to cheat on those.
Re:"Pretty Much All of Them" (Score:5, Insightful)
No, not really. Comparing iOS and Android directly on performance is silly. They're two totally different ecosystems and hard performance numbers don't change much. That's like a typical user picking a Mac or Windows PC because one performs 5% better at random tasks, ignoring the fact that the offerings of each machine are radically different and pure performance numbers are only a tiny part of the whole picture.
Apple has no reason to cheat because they have no competition that merits the risk of cheating on. It might have been a different story had iOS hardware been available from multiple vendors.
Re: (Score:2)
Why would it be "silly"? If the point of benchmarking is to compare "like" things, and the same game is written for both ecosystems, why wouldn't the concerned consumer want to know that game X runs 20% faster on device Y, regardless of whether device Y is android or iOS? The only people concerned with these benchmarks must be looking for that 5% difference. So if that's what they want, then knowing that another platform gets them that 5% should be just as important as knowing the performance spread among
Re:Probably wont get better (Score:4, Insightful)
They started that game long before you could get 1 GB hard drives.
Re: (Score:2)
Was it really as many gigabytes as it was advertised?
Yep.
You're not confusing gigabytes with gibibytes are you?
The only people who ever used powers of 1024 were RAM manufacturers since it makes sense there.
Do you also complain that 100mbit ethernet is not 100*1024*1024 bits per second?
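The gap being argued about is plain arithmetic: a decimal gigabyte is 10^9 bytes and a gibibyte is 2^30 bytes. A minimal sketch (the 500 GB drive size is just an example input, and the function name is made up):

```python
GB = 10**9    # decimal gigabyte (SI prefix)
GIB = 2**30   # binary gibibyte (IEC prefix)


def decimal_gb_to_gib(gigabytes):
    """Convert a capacity advertised in decimal GB to binary GiB."""
    return gigabytes * GB / GIB


# A drive sold as 500 GB shows up as roughly 465.66 GiB in an OS
# that counts in powers of 1024.
advertised_500 = round(decimal_gb_to_gib(500), 2)

# Network rates use decimal prefixes: 100 Mbit/s is 100,000,000 bits/s,
# not 100 * 1024 * 1024 = 104,857,600 bits/s.
ethernet_bits_per_s = 100 * 10**6
binary_reading = 100 * 1024 * 1024
```

The difference is about 7% at the giga scale and grows with each larger prefix, which is why the mismatch became more noticeable as drives got bigger.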
Re: (Score:2)
Do you also complain that 100mbit ethernet is not 100*1024*1024 bits per second?
My ISP, as sucky as it is, doesn't cheat here. Heck, it's only disk manufacturers who routinely do.
You don't change a unit in wide use for more than six decades just because some committee feels they don't get enough attention.
Re: (Score:2)
The only people who ever used powers of 1024 were RAM manufacturers since it makes sense there.
Well, them and every software engineer on the planet.
(Yes, I think in K. And yes, I will continue to define 1k as 1024, no matter how many pedantic monkeys object)
Re: (Score:2)
Considering that hard disk sectors have ALWAYS been a power-of-2 even though the encoding of them may not, using a power of 10 was pure marketing bullshit.
> You're not confusing gigabytes with gibibytes are you?
That stupid term was invented years after the fact of common usage. You can piss off with that retarded term.
> Do you also complain that 100mbit ethernet is not 100*1024*1024 bits per second?
That includes meta-data such as parity, error detecting, etc. There is NO confusion because it has alw
Re: (Score:2, Informative)
The discrepancy is deliberate. For a long, long time drive capacity was quoted in the same units that the computer used for storage: binary SI prefixes, not decimal ones. The change to "1 megabyte = 1 million bytes" didn't set in until the 2000s.
Kudos to Apple for making their specs and their OS use consistent units, but it's still a marketing bullshit decision.
Re: (Score:2)
Yes, I'm sure it's all lies where it conflicts with $priorbeliefs.
Re: (Score:2, Informative)
The performance quoted simply is not available to apps that are not on a whitelist of benchmark applications. It literally does not represent any part of the phone's non-benchmarking performance.
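The whitelist behaviour described here can be pictured as a simple name lookup. This is only a sketch of the idea, not any vendor's actual code: the package names and clock limits below are hypothetical, and real devices do this somewhere in the system software rather than in Python.

```python
# Hypothetical benchmark package names, for illustration only.
BENCHMARK_WHITELIST = {
    "com.example.benchmark_a",
    "com.example.benchmark_b",
}

# Hypothetical clock ceilings in MHz.
NORMAL_MAX_MHZ = 1200
BOOSTED_MAX_MHZ = 1600


def max_clock_for(package_name):
    """Return the clock ceiling an app would get under a name-based whitelist."""
    if package_name in BENCHMARK_WHITELIST:
        return BOOSTED_MAX_MHZ
    return NORMAL_MAX_MHZ
```

Because the decision keys only on the name, an ordinary game calling `max_clock_for("com.example.mygame")` stays at the normal ceiling, while anything carrying a whitelisted name gets the boost. That is why the renaming trick suggested elsewhere in the thread would expose this kind of scheme.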
Re: (Score:2)
So it's not cheating to detect when a specific benchmark app is running, and then clock the device to a setting that in no way will ever be set when not running those specific benchmark apps.
Okay. I guess it's not cheating to pump yourself full of steroids right before the start of a professional baseball season either.
Re: (Score:2)
I'm not sure that your argument makes any sense.
"You cheated. You copied the answer key."
"I did the whole test, though!"
Re: (Score:3)
You do realize that this is a battery-operated device, and that such 'tweaking' dramatically impacts battery life? Couple that with them not reporting the actual battery life while running the GPU/CPU overclocked, and you are essentially lying out of your eye teeth.
Re: (Score:3)
You do realize that this is a battery-operated device, and that such 'tweaking' dramatically impacts battery life? Couple that with them not reporting the actual battery life while running the GPU/CPU overclocked, and you are essentially lying out of your eye teeth.
This may not be as straightforward as you think: when batteries are cold, they have a different "extractable" capacity compared to when they are hot. Running the CPU/GPU at full tilt is going to warm up the battery. It is kind of a guess as to how much is actually left in there, which could easily explain this.
Re: (Score:2)
The key fact is that none of the other applications get to run under those conditions (though Samsung gets squishy on this point) and so the cheated benchmark ends up representing an unrealistic performance level.