OP 8/22/12 10:32:14 AM#1
Something I noticed while playing WoW is that 2 cores of my 4-core processor barely get used.
One averaged around 75%
one around 40%
and the other two at 20%
If this kind of usage is typical of games, wouldn't it be better to stick with dual core processors for gaming?
8/22/12 9:47:26 PM#2
Originally posted by jusomdude
Windows should do it all for you... the game wouldn't control it... maybe years ago I remember games having the option, but these days Windows does it well enough.
What you're seeing is that the game really uses one core, maybe some tasks on the second core... then background apps on the other two cores.
Yeah, it would be more efficient to use a dual core... quad core is nice for when you have the browser open or you're running voice chat or FRAPS etc... it lets you game and do other stuff at the same time.
Getting a six core or an i7 for gaming only is a total waste, since... you can clearly see how little games need extra cores and hyperthreading.
Moral of the story: save $100 on the CPU by getting an i5, and put that $100 towards a better graphics card... something that will directly translate to gaming performance.
8/22/12 10:52:35 PM#3
Short version: it varies wildly from game to game.
Programmers don't get to decide which part of the program runs on which core. Rather, programmers can break their program into separate threads, and different threads can run on different cores. When the program runs, the operating system decides which threads go on which processor cores, taking into account the other programs running on the computer at the same time.
In order to logically break a program into separate threads, the threads need to have little to no dependence on each other. Otherwise, they can trip over each other and cause all sorts of problems.
For example, suppose that a program tries to count how many times some event occurs, and the event can occur in any one of several threads. When the event occurs, the thread is supposed to increase a counter by 1, and let's say the program stops when the event has occurred 100 times. Suppose that the threads share a counter, and whenever the event occurs, they read the counter, add one to it, and store the new value.
What happens if the event occurs in two different threads running on two different processor cores at the same time? Let's suppose that the counter is at 25. The first thread reads the counter value as 25. While it is adding 1 to it, the second thread also reads the counter value as 25. The two threads add 1 to it at the same time, and both get a value of 26. Then they store the value of 26 in the counter, one after the other. The event we're looking for happened twice, but the counter only increased from 25 to 26.
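The lost update described above can be walked through step by step. This is a minimal sketch in Python (not a language the thread mentions, just a convenient one): it simulates the interleaving deterministically rather than relying on an actual race, which is timing-dependent and may not reproduce on demand.

```python
# Deterministic simulation of the lost update: both "threads" read the
# counter before either one writes its result back.
counter = 25

a_local = counter   # thread A reads 25
b_local = counter   # thread B also reads 25, before A writes back

a_local += 1        # A computes 26
b_local += 1        # B computes 26

counter = a_local   # A stores 26
counter = b_local   # B stores 26, overwriting A's identical value

# The event happened twice, but the counter only rose from 25 to 26.
print(counter)  # 26, not 27
```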
This can also happen if the counter is stored in processor cache. Suppose that one thread sees the event occur, and increases it from 25 to 26 as stored in system memory. Another thread later sees the event occur, but has the old value of 25 stored in processor cache, adds 1 to it, and stores the new value of 26 in system memory. Again, the event occurred twice, but the counter only goes up by 1.
There are ways to get around this, of course. It's possible to make it so that when a thread wants to increment the counter, it locks it first so that no other threads can touch it. Then it reads the value, adds 1 to it, and writes the new value. If another thread wants to access the counter at the same time, it gets told that the counter is locked and it will have to wait. Once the first thread writes the new value of 26, it unlocks the counter, and then the second thread can read the new value of 26 and add 1 to that.
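The lock-then-increment pattern above maps directly onto a mutex. Here is a small Python sketch using `threading.Lock`; with the lock held around the read-add-write sequence, the final count is always exact no matter how the threads interleave.

```python
import threading

counter = 0
lock = threading.Lock()

def record_events(n):
    """Record n occurrences of the event, one locked increment at a time."""
    global counter
    for _ in range(n):
        with lock:        # no other thread can touch the counter here
            counter += 1  # read, add 1, write back, all under the lock

threads = [threading.Thread(target=record_events, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # always 40000 with the lock in place
```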
If threads are mostly doing their own thing and only share a counter that they only occasionally have to increment, this is pretty easy to work around. But what if threads have to mostly access the same variables? If you constantly have one thread waiting for another to unlock a variable, then you effectively have fewer threads running at a time. Worse, what if two threads need two variables simultaneously (e.g., they want to multiply the two numbers by each other), and one thread locks one variable and the other thread locks the other? At that point, the program is deadlocked, with the two threads playing a game of chicken, each refusing to be the first to unlock its variable.
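The two-lock standoff has a standard remedy: make every thread acquire the locks in the same global order, so the circular wait can never form. A hedged sketch (the deadlock-prone ordering is left as a comment, since actually running it would hang):

```python
import threading

lock_x, lock_y = threading.Lock(), threading.Lock()
x, y = 6, 7
results = []

# Deadlock-prone pattern (sketched, not run):
#   thread 1: acquire lock_x, then lock_y
#   thread 2: acquire lock_y, then lock_x
# If each grabs its first lock before the other grabs its second,
# both wait forever, the game of chicken described above.

def multiply_safely():
    # The fix: all threads take the locks in the same order (x before y),
    # so no thread can hold y while waiting on x.
    with lock_x:
        with lock_y:
            results.append(x * y)

threads = [threading.Thread(target=multiply_safely) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # [42, 42]
```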
The upshot is that in order to break a program into threads, it's essential to be able to break the computations into a bunch of different blocks that depend very little on each other. That way, separate threads can run simultaneously on separate processor cores without needing to know or care what is going on in the other threads.
But it's not just that you need a lot of separate threads. You need a lot of separate threads that each do a substantial amount of work. If you break things up into 10 threads, but one of the threads has to do 90% of the work, then that one thread might get a processor core all to itself, but the other nine threads (and the rest of the processor cores) will mostly sit there idle.
The real key is that you want the "biggest" thread to do as little work as possible, as that's what limits scaling. If the largest thread has to do a fraction x of the work, then the program can scale to at most 1/x cores. (Since 0 < x < 1, 1/x > 1.) Worse, scaling isn't constant like this, but can change from moment to moment. If two threads do the same amount of work in the long run, but during any particular millisecond, one thread or the other has to do 2/3 of the work, and they switch back and forth as to which thread is busier, then the program only scales to 1.5 cores.
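That 1/x bound is a one-line calculation. A back-of-envelope helper (the function name is mine, not anything from the thread):

```python
def max_scaling(largest_fraction):
    """If the busiest thread does fraction x of the work at any moment,
    the program cannot run more than 1/x times faster than one core."""
    return 1 / largest_fraction

print(max_scaling(0.25))  # four perfectly equal threads scale to 4.0 cores
print(max_scaling(2 / 3)) # a thread doing 2/3 of the work caps scaling at 1.5
```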
Some algorithms make this much harder to do than others. Some programs are pretty trivial to scale. For example, if you want to compute the number of primes less than 1 billion, you could compute the primes up to 31622, and then you can check to see if any larger number is prime by checking whether it has any factor smaller than 31622. So you could have one thread check the numbers from 31623 to 1 million, another thread check the numbers from 1 million to 2 million, another from 2 million to 3 million, and so forth. Each thread doesn't care what happens in the others, other than having to add its count to the final total at the very end.
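The prime-counting decomposition above can be sketched directly. This version is scaled down to counting primes below 100,000 so it runs quickly; note that in CPython the GIL means these threads won't actually execute in parallel (a process pool would be needed for a real speedup), but the structure, independent slices plus one tiny shared step at the end, is exactly the point being made.

```python
import math
import threading

LIMIT = 100_000              # scaled down from 1 billion for illustration
root = math.isqrt(LIMIT)     # 316; primes up to here suffice as trial divisors

# Step 1: find the small primes up to sqrt(LIMIT) by trial division.
small_primes = [n for n in range(2, root + 1)
                if all(n % p for p in range(2, math.isqrt(n) + 1))]

counts = []
counts_lock = threading.Lock()

def count_range(lo, hi):
    """Count primes in [lo, hi); no other thread touches this slice."""
    c = sum(1 for n in range(lo, hi)
            if all(n % p for p in small_primes if p * p <= n))
    with counts_lock:        # the only shared step: adding to the total
        counts.append(c)

# Step 2: hand each thread its own slice of the number line.
chunk = 10_000
threads = [threading.Thread(target=count_range,
                            args=(lo, min(lo + chunk, LIMIT)))
           for lo in range(root + 1, LIMIT, chunk)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = len(small_primes) + sum(counts)
print(total)  # 9592 primes below 100,000
```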
If you had a thousand processor cores, you could have a separate thread running on each core mostly in parallel. Some threads would have a little more work to do than others, but the program would easily scale to several hundred cores. In practice, if you make "too many" threads, Windows just runs several threads on the same processor core and makes them take turns. But it would be easy to use all of the cores in any modern desktop with this algorithm.
At the other extreme, suppose that you have an algorithm for which one particular step always requires the output from the previous step. In that case, you're going to be single-threaded, and there's nothing you can do about it, apart from trying to find a more clever algorithm to give you the same final answer.
For games, one big barrier to threading is that you only have a single rendering thread. If you have two different threads sending DirectX or OpenGL API calls to the video card at the same time, then they'll trip over each other, and a change made by one thread will make the video card interpret something sent by the other thread differently from what that thread intended.
DirectX 11 brought multi-threaded rendering. As far as I know, Civilization V is the only game to implement this so far. And neither AMD nor Nvidia had drivers ready to handle it properly, though both showed major gains in drivers as time passed. I'm guessing that multi-threaded rendering is an enormous pain to write drivers for, which would explain why DirectX didn't implement it earlier, games generally don't use it even now that DirectX 11 offers it, and OpenGL still hasn't implemented it at all.
There are, of course, a lot of other things that a game has to do besides sending API calls to the video card. But those things mostly consist of processing data to determine exactly what gets sent to the video card. For example, the processor determines exactly how far a character has moved in the frame that is to be displayed, and then the rendering thread has to send that data to the video card to tell it where to draw the character.
The real trick to making a game scale well to many processor cores is to make it so that the game rendering thread doesn't have to do very much of the work. Get other threads to do the work of determining where everything has moved, so that the rendering thread can just spam API calls and not have to do major computations to figure out what data it is supposed to send.
Breaking the other work into a bunch of separate threads is likely to be simpler. If you want to know exactly where all of the characters in a game have moved for a given frame, you can have different threads computing what happened to different characters at the same time.
But the real issue isn't just trying to make the game scale to more processor cores. It's pretty trivial to make any program scale to 8 processor cores, simply by creating 7 threads that spam useless junk computations. Rather, the key is to make it so that no particular thread has to do very much work. It's better to have a purely single-threaded game that runs well even on a ULV laptop processor than a game that scales flawlessly to eight cores, but still doesn't run smoothly even on an eight-core processor like an AMD FX-8150.
I realize that this doesn't entirely answer your question. But in a sense, you've asked the wrong question. The right question is: how many cores do you need? A game that can break its work into four equal threads will likely use four processor cores if you've got them. But if each of the threads is small enough that they could all be run on a single core and the game would still run well, then it doesn't really matter that the game scales well to four cores. You wouldn't notice any difference between running it on a dual core processor and a quad core.
And again, this varies wildly from game to game. Guild Wars 1 was so light on processor usage that when I play it on my current computer, the processor declares itself idle and clocks down--without hurting performance. That's why I'm surprised that Guild Wars 2 is so processor-heavy.
If you have a particular game and want to know how well it is threaded, watching per-core usage (as you did) doesn't help much. The problem is that Windows moves threads around, so even if a game has to do 90% of its work in a single thread, Windows might bounce that thread between all of your cores, so they all show a substantial amount of work being done.
What matters is, do you lose performance if you lose processor cores? If you want to check threading capabilities, go into your BIOS and reduce your CPU clock speed to the minimum allowed. (You can generally do this even on a processor with a locked multiplier.) Then tinker with graphical settings, turning basically everything to the minimum, to try to make the game processor-bound. Then you can measure frame rates repeatedly with different numbers of cores disabled in the BIOS. If you have a quad core processor and get the same frame rates with 3 cores as with 4, but it drops considerably with two cores, then the game scales to three cores. It might well use all four cores under normal circumstances, but it only scales to 3.
Still, the trend is toward better scaling to more processor cores. If you're buying a new computer today, I'd recommend getting a quad core processor. Even if some games don't scale past two cores yet, some do, and it's likely that more games will do so as time passes.
8/22/12 11:43:56 PM#4
Games are getting better about it.
Quiz has the long of it. Programmers can, to some extent, control how well their game runs on multiple cores, and can target a specific number of cores if they choose. Most target 2 or 3 (as most people have at least a dual-core CPU these days). The OS has generic control of which thread goes where, but you can drop down programming-wise and take more direct control if you absolutely have to (things like specifying that AI and physics don't share a core, or preventing inordinate core swapping).
It's very, very rare to see a game actually exploit 4+ cores. Quad cores are becoming more mainstream, but are still somewhat of an outlier, and not many people have more than 4 cores available (yet). The hard part is making it so that you get uniform gameplay on a single-core versus a 4+ core machine, maybe not in FPS/graphics, but at least in AI/game mechanics. And AI/game mechanics are what really tend to scale well to multiple cores, as graphics effects are less CPU-bound (and more GPU-bound, where they are already massively parallel, out to hundreds of cores). But you need to have the same game mechanics and level of difficulty regardless of what computing capabilities the player has, so you're somewhat in a Catch-22.
CPUs took the more-cores approach once they hit a brick wall in the mid-2000s: they really couldn't make single-core CPUs much faster (the 3.8GHz Prescott P4 was the end of the line for single-core desktop CPUs, mid-2005). So in order to get more speed, they branched outward and added more cores. You get more theoretical computation, but only if you can actually run parts of your program in parallel.
In general, the faster the individual core, the better gaming performance you will get. That has been true since basically the dawn of computer gaming. But in the past few years, even a very fast dual core will start to struggle, as people tend to do silly things on their computers (like actually multitasking) that demand more resources.
8/22/12 11:48:58 PM#5
Originally posted by jusomdude
It varies game to game, but quite a few games are practically unplayable without 4 cores. APB Reloaded needs 4 cores, and from what I've seen, CryEngine 3 has horrid performance on dual cores. At this stage I think 4 cores is going to get you more mileage. On old titles like WoW, the 40% is probably your OS and the 75% is WoW; I'd expect the other 2 to be lower generally, but it does depend on what is running on your box.