
MMORPG.com Discussion Forums

General Discussion


Hardware  » NVIDIA @ CES

  Quizzical

Guide

Joined: 12/11/08
Posts: 13774

1/08/13 3:15:44 PM#21
Originally posted by Ridelynn

 


Originally posted by Quizzical

Originally posted by lizardbones  

Originally posted by Ridelynn The idea is that you have 24 GPUs in a chassis. These are presumably SLI'ed in some fashion (similar to how they are doing it for high performance applications). And with a nice backplane, there's no reason you can't take multiple chassis and have them talk to each other in a single cabinet. And multiple cabinets talk to each other... You end up with something like this. As a gamer, your game may be running on 10 GPUs, or it may run on a single shared GPU. It will use as many GPU cycles as the owner of the Grid allows it to, but the Grid device is designed to just be a scalable computing resource. It's not necessarily one GPU per client. It's "a big mess of GPU computing power" provided for a "big mess of customer-driven graphics workloads".
It seems to work the same way that virtual machines work in a business environment. The system just pushes power around to wherever it's needed, regardless of how many or how few gpu cycles it requires. Except instead of cpu or memory, it does this with the GPU, which if I understand correctly hasn't been done before. You can stack the devices together, but I don't know if they all work together as one large unit. You might have 35 players per box, or 700 players per 20 boxes. Dunno.
In order for a supercomputer to be useful, you have to have an application that readily splits into an enormous number of threads that don't need to communicate with each other all that much.  That's basically the opposite of what a GPU does with games.

 

Not necessarily. Think about it - internal to a GPU there are hundreds of compute units - whether you call them Shader Processors, CUDA Cores, Stream Units, or what have you. The GPU workload is already split across these in an extremely parallel fashion.

The problem with SLI/CFX (and presumably Grid) isn't that the work doesn't split across an enormous number of threads: we already know that a typical GPU workload does that extremely well. The limitation has mainly been in the software implementation. We've seen properly packaged software, with driver support, reach efficiencies that approach 100% for dual-GPU setups (I'll admit they taper off greatly after that in desktops, but the point is we can get scaling to work across multiple GPUs; maybe Grid doesn't use PCIe 3.0, or has some other means than the SLI bridge to accomplish better scaling - I don't know). The main problem is that it hasn't been generically that great - you can't just throw in multiple GPUs and expect it to work; the software has to incorporate it to some degree, and the drivers have to be tweaked just so. And there's some loss of efficiency in making it convenient - using PCIe buses with only a simple bridge connection to make it easy to install, and making the drivers forgiving enough to allow for slight hardware mismatches and timing inaccuracies.

With virtualization software, it would be very easy to genericize any number of discrete computing assets, and then re-allocate them as needed. Amazon does this with general computing power in Elastic Compute Cloud (EC2). It's not really any different; the driver just has to work a bit differently than it does on a typical Windows platform - and since nVidia is making all the hardware, and writes the driver, it wouldn't be very hard for them at all.

It's definitely not meant for consumer gaming - and I agree to some extent, it's intended for enterprise-level applications. It's basically nVidia taking their recent supercomputer experience and trying to play a video game on it. We may see it used in baby supercomputers (for universities and the like), or render farms (they already use similar technology). OnLive is the obvious assumption, but I think that model is too far ahead of its time - the internet infrastructure can't keep up with it. It obviously isn't for home use. But we may see something new come from it... it has some obvious implications for the Shield, and there could be more to that story that just hasn't been told yet.

In order for a GPU-friendly workload (which requires massive amounts of SIMD and little to no branching, among other things) to scale well to multiple GPUs, it's not enough for it to merely be trivial to break into an enormous number of threads.  It's also critical that those threads not need to communicate with each other very much.  Graphics computations trivially break into many thousands of threads if you need them to, but those threads have to communicate to an enormous degree.

Let's dig into how the OpenGL pipeline works; DirectX is similar.  You start with a vertex shader.  This usually takes data from a vertex array object in video memory, though it can be streamed in from the CPU.  This reads in some vertex data and some uniforms, does some computations, produces some data, and outputs it.  One vertex shader doesn't have to talk to another vertex shader directly, but in order to properly vectorize the data, GPUs will probably tend to run the same vertex shader on a bunch of different vertices at the same time, since they all apply the same instructions in the same order, but merely to different starting data.  (That's what SIMD, or Single Instruction Multiple Data means.)
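As a rough illustration of that SIMD idea, here is a CPU-side C++ sketch (not real shader code; the matrix and vertex values are made up): the same "shader" body runs over a batch of vertices - same instructions, different data.

```cpp
// CPU-side sketch of the SIMD idea behind vertex shading: one instruction
// stream applied to a whole batch of vertices. A GPU runs these iterations
// in lockstep across its shader lanes; here it's just a plain loop.
#include <array>
#include <cstdio>

struct Vec4 { float x, y, z, w; };

// The same "vertex shader" body, run for every vertex in the batch.
Vec4 vertexShader(const Vec4& p, const std::array<float, 16>& mvp) {
    return { mvp[0]*p.x + mvp[4]*p.y + mvp[8]*p.z  + mvp[12]*p.w,
             mvp[1]*p.x + mvp[5]*p.y + mvp[9]*p.z  + mvp[13]*p.w,
             mvp[2]*p.x + mvp[6]*p.y + mvp[10]*p.z + mvp[14]*p.w,
             mvp[3]*p.x + mvp[7]*p.y + mvp[11]*p.z + mvp[15]*p.w };
}

int main() {
    std::array<float, 16> mvp = {1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1};  // identity "uniform"
    Vec4 batch[4] = {{0,0,0,1}, {1,0,0,1}, {0,1,0,1}, {1,1,0,1}};      // different input data

    for (const Vec4& v : batch) {                    // same instructions for every vertex
        Vec4 clip = vertexShader(v, mvp);
        std::printf("(%g, %g, %g, %g)\n", clip.x, clip.y, clip.z, clip.w);
    }
}
```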

Then you take outputs from vertex shaders and have to look at which patches (if you think of them as triangles, that's close enough) contain which vertices.  Each invocation of a tessellation control shader has to read in the vertex shader outputs from each of the vertices that the patch contains.  That means a tessellation control shader input typically corresponds to three or so sets of vertex shader outputs.  It doesn't have to be three; it can be any arbitrary number, but it will usually be three if you do tessellation in what I think is the obvious, straightforward manner; it's certainly the geometrically intuitive manner.

Meanwhile, the same vertex is likely to be part of several different patches, so an average set of vertex shader outputs needs to get read in as input by several different tessellation control shaders.  For what I'm working on, the average number of tessellation control shader invocations that will read in a given vertex shader output is around 3-4, though it varies by which program you're using.  And it's not a simple case where the outputs corresponding to these three vertex shaders get read in inputs corresponding to those three patches.  Which vertices correspond to which patches can be arbitrary, and it's a many to many relationship.
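A tiny sketch of that many-to-many relationship, using a hypothetical index list (illustrative C++ only, not any real API): each patch reads three vertices' outputs, and the same vertex output is read by several patches.

```cpp
// Hypothetical mesh: three triangular patches sharing four vertices.
// Counting how many patches consume each vertex shows the fan-out from
// vertex shader outputs to tessellation control shader inputs.
#include <cstdio>
#include <map>

int main() {
    int patches[3][3] = { {0, 1, 2}, {0, 2, 3}, {1, 2, 3} };  // vertex indices per patch

    std::map<int, int> consumers;          // vertex index -> number of patches reading it
    for (const auto& patch : patches)
        for (int v : patch)
            ++consumers[v];

    for (const auto& [vertex, count] : consumers)
        std::printf("vertex %d is read by %d patch(es)\n", vertex, count);
}
```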

Then the tessellation control shader does a little bit of work before handing its outputs off to the hardware tessellator.  In some video cards, such as my Radeon HD 5850, there is only one hardware tessellator in the entire chip.  It doesn't matter which patch the data came from, or which surface, or even which program; it all has to go to the same hardware tessellator.  Some cards have multiple tessellators, but I think that's more for latency or bandwidth reasons than raw tessellation performance.

So then the tessellator does its thing and creates a bunch of new vertices and a bunch of new data on how the vertices are adjacent.  A single patch sent to the tessellator can easily produce hundreds of vertices and hundreds of triangles, though it will typically produce far fewer.  (The OpenGL 4.2 specification requires hardware to support any tessellation degree up to at least 64, though hardware can optionally support higher than that.  Setting all tessellation degrees to 64 for triangles means each patch will correspond to 2977 vertices and 6144 triangles.)

Then the vertex data that was output from the tessellation control shader gets input into tessellation evaluation shaders.  All of the data that was output for a given patch must be available as inputs to the next stage, and this includes the outputs for each vertex of the patch.  Additionally, the hardware tessellator provides, for each vertex it outputs, barycentric coordinates describing where within the patch that particular vertex lies.  This time, while each vertex in a tessellation evaluation shader corresponds to only one patch in a tessellation control shader, a single patch can correspond to many vertices.
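As a minimal sketch of what that evaluation step amounts to (illustrative C++, not GLSL, assuming a plain barycentric blend of a triangular patch's per-vertex outputs):

```cpp
// Each tessellator-generated vertex comes with barycentric coordinates
// (u, v, w) and needs all three per-vertex outputs of its patch to compute
// its own position -- a simple linear blend in this sketch.
#include <cstdio>

struct Vec3 { float x, y, z; };

Vec3 evaluate(const Vec3 patch[3], float u, float v, float w) {
    return { u*patch[0].x + v*patch[1].x + w*patch[2].x,
             u*patch[0].y + v*patch[1].y + w*patch[2].y,
             u*patch[0].z + v*patch[1].z + w*patch[2].z };
}

int main() {
    Vec3 patch[3] = { {0,0,0}, {1,0,0}, {0,1,0} };
    Vec3 centre = evaluate(patch, 1.0f/3, 1.0f/3, 1.0f/3);  // one new vertex inside the patch
    std::printf("new vertex: (%g, %g, %g)\n", centre.x, centre.y, centre.z);
}
```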

Then the tessellation evaluation shader does its thing, and outputs a bunch of data for geometry shaders.  The way data gets passed from tessellation evaluation shaders to geometry shaders is a lot like the way it gets passed from vertex shaders to tessellation control shaders:  each output will usually be input into several invocations of the next stage, and each input gathers data from several outputs of the previous stage.

Then the geometry shader does whatever it does and outputs data for each triangle that is to be drawn.  This goes to the rasterizer that figures out which pixels on the screen correspond to a particular triangle.  It produces a fragment for each such pixel, and then sends the fragments on to fragment shaders.  There can be many fragments corresponding to the same triangle.  In extreme cases, a single triangle can correspond to millions of pixels.  While that's unusual, getting thousands of fragments from a single triangle is not.
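To get a feel for that fan-out, here is a naive, illustrative C++ sketch that just counts how many fragments one made-up screen-space triangle produces (real rasterizers are hierarchical and far smarter than a bounding-box scan):

```cpp
// Count the pixels covered by one triangle: each covered pixel becomes a
// fragment, i.e. one fragment shader invocation.
#include <algorithm>
#include <cstdio>

struct P { float x, y; };

// Signed area test: which side of edge (a, b) the point c falls on.
float edge(const P& a, const P& b, const P& c) {
    return (c.x - a.x) * (b.y - a.y) - (c.y - a.y) * (b.x - a.x);
}

int main() {
    P v0{100, 100}, v1{700, 150}, v2{300, 500};   // one modest screen-space triangle

    int minX = (int)std::min({v0.x, v1.x, v2.x}), maxX = (int)std::max({v0.x, v1.x, v2.x});
    int minY = (int)std::min({v0.y, v1.y, v2.y}), maxY = (int)std::max({v0.y, v1.y, v2.y});

    long fragments = 0;
    for (int y = minY; y <= maxY; ++y)
        for (int x = minX; x <= maxX; ++x) {
            P p{x + 0.5f, y + 0.5f};
            float w0 = edge(v1, v2, p), w1 = edge(v2, v0, p), w2 = edge(v0, v1, p);
            // Inside the triangle (either winding): one fragment per covered pixel.
            if ((w0 >= 0 && w1 >= 0 && w2 >= 0) || (w0 <= 0 && w1 <= 0 && w2 <= 0))
                ++fragments;
        }
    std::printf("one triangle -> %ld fragments\n", fragments);
}
```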

Meanwhile, the rasterizer has to take the data output from three separate vertices of a triangle and produce data at the particular point in the triangle corresponding to the pixel on the screen for each fragment.  It gives you a few choices on how data will be scaled, but it usually interpolates between the vertices, so that a given triangle from a geometry shader can correspond to many different fragments, all of which get different input data.

Then the fragment shaders do their thing, and output the color of a fragment and possibly also the depth.  Then they take the depth that is output and check to see whether there is already some other fragment that has been drawn for that pixel in "front" of it.  If so, then the fragment that was computed is discarded as being "behind" something else in the scene.  If not, then the color and depth output from a fragment shader get written to that particular pixel for the framebuffer and depth buffer, respectively.

Here, every fragment in every fragment shader has to use exactly the same framebuffer and depth buffer.  It doesn't matter if they're part of the same triangle, or generated by the same API drawing call, or even generated by the same program.
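A simplified sketch of that depth test against the one shared depth buffer and framebuffer (illustrative C++; it assumes a "less than" depth function and ignores blending, stencil, and multisampling):

```cpp
// Every fragment, from whatever triangle or program, is tested against the
// same shared depth buffer and written to the same shared framebuffer.
#include <cstdint>
#include <vector>

struct Fragment { int x, y; float depth; uint32_t color; };

void writeFragment(const Fragment& f,
                   std::vector<float>& depthBuffer,
                   std::vector<uint32_t>& framebuffer,
                   int width) {
    int idx = f.y * width + f.x;
    if (f.depth < depthBuffer[idx]) {      // in front of whatever is already there?
        depthBuffer[idx] = f.depth;        // yes: update both shared buffers
        framebuffer[idx] = f.color;
    }                                      // no: the fragment is discarded
}

int main() {
    const int width = 640, height = 480;
    std::vector<float>    depthBuffer(width * height, 1.0f);        // "far" everywhere
    std::vector<uint32_t> framebuffer(width * height, 0xff000000u); // opaque black

    writeFragment({10, 10, 0.5f, 0xffff0000u}, depthBuffer, framebuffer, width);  // drawn
    writeFragment({10, 10, 0.9f, 0xff00ff00u}, depthBuffer, framebuffer, width);  // behind: discarded
}
```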

So as you can see, there's a tremendous amount of communication between threads.  Passing data from one stage of the pipeline to the next is rarely a one to one relationship.  Sometimes it's one to many or many to one, or even many to many.

Enormous amounts of data have to be passed around.  From a quick calculation, it's not at all obvious whether the amount of non-texture data that is input into my fragment shaders alone exceeds the video memory bandwidth of my entire video card.  And that's ignoring outputs, textures, and all of the other pipeline stages, not to mention the work that is actually done within the fragment shader.  That calculation does count uniforms separately for each fragment shader invocation, and uniforms account for about half of the input data.

But if you're having to pass around too much data from one stage to the next for even a GPU's own dedicated video memory to have enough bandwidth, then passing that sort of data from one GPU to another is completely out of the question.

You can readily have different GPUs rendering different frames simultaneously.  That way, a GPU computes an entire frame on its own, and then only has to send the completed framebuffer somewhere else.  There is little dependence between the computations in one frame and those in the next, unless you neglect to wipe the framebuffer and depth buffer.

If you're trying to render a movie, can work on a bunch of frames simultaneously, and don't care if each frame takes an hour to render, then this could work.  Though if that's your goal, I'd question why you need the "36 users for a single card" functionality.

But spreading that across several GPUs simply isn't useful for rendering games, due to latency issues.  1000 GPUs each rendering one frame per second would get you an amazing 1000 frames per second--and leave the game completely unplayable because all you can see is the state of the game as it was a full second ago.
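The arithmetic behind that last point, using the same hypothetical numbers, as a tiny sketch:

```cpp
// Alternate frame rendering multiplies throughput, but each frame still
// takes one GPU's full render time, so what you see is that old.
#include <cstdio>

int main() {
    const double secondsPerFrame = 1.0;   // one GPU needs a full second per frame
    const int    gpus            = 1000;  // each GPU works on a different frame

    double throughputFps = gpus / secondsPerFrame;   // frames completed per second
    double latencySec    = secondsPerFrame;          // age of the newest finished frame

    std::printf("throughput: %.0f fps, but every frame shows the game "
                "as it was %.1f s ago\n", throughputFps, latencySec);
}
```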

  Ridelynn

Elite Member

Joined: 12/19/10
Posts: 3656

1/08/13 11:26:55 PM#22

You're assuming that the GPU is a complete video-card-like unit, like on a PC - with its own power supply and memory on a card, and cross-talk between GPUs being forced to mirror the memory and push information over a PCI bus.

What's to stop Grid from just having a lot of GPUs, but a common VRAM bank, or (much) higher-bandwidth communication lanes?

It just says 24 GPUs per 4U rack unit. Nothing says these are 24 "video cards", or even how much VRAM they have access to or what the communication infrastructure is. If you look at a picture of a rack (there are 24 GPUs per rack, up to 20 racks per cabinet)... you can't even identify the discrete GPU units - it just looks like 4 hard drive cages and a big mess of fans running down the center of the rack (see link below for a picture).

Looking at nVidia's page - they include a hypervisor to "virtualize" the driver for software access, acting as a ringmaster for the hardware. It's set up almost identically to their high performance units (Titan Supercomputer is the latest example).

http://www.nvidia.com/object/cloud-gaming-benefits.html

No big surprises. I think it will work out fairly well as far as actual rendering performance. I just remain skeptical of the network latency - which has always been the real concern with "cloud gaming".

  lkc673

Novice Member

Joined: 4/08/10
Posts: 150

1/09/13 12:10:32 AM#23

Weird that companies are busting out handheld consoles all of a sudden. They must have done some good research to see that there is a market for handhelds with casual games. I'm not sure if I would fork out big bucks for a high-end handheld though. With my games I would want to mod or do some creative stuff, and I doubt these handhelds are able to do that as of yet. 

Guess Microsoft is the only one without a handheld device now!!! Let's see if they make one soon!

  Quizzical

Guide

Joined: 12/11/08
Posts: 13774

1/09/13 12:13:08 AM#24
Originally posted by Ridelynn

You're assuming that the GPU is a complete video-card-like unit, like on a PC - with its own power supply and memory on a card, and cross-talk between GPUs being forced to mirror the memory and push information over a PCI bus.

What's to stop Grid from just having a lot of GPUs, but a common VRAM bank, or (much) higher-bandwidth communication lanes?

It just says 24 GPUs per 4U rack unit. Nothing says these are 24 "video cards", or even how much VRAM they have access to or what the communication infrastructure is. If you look at a picture of a rack (there are 24 GPUs per rack, up to 20 racks per cabinet)... you can't even identify the discrete GPU units - it just looks like 4 hard drive cages and a big mess of fans running down the center of the rack (see link below for a picture).

Looking at nVidia's page - they include a hypervisor to "virtualize" the driver for software access, acting as a ringmaster for the hardware. It's set up almost identically to their high performance units (Titan Supercomputer is the latest example).

http://www.nvidia.com/object/cloud-gaming-benefits.html

No big surprises. I think it will work out fairly well as far as actual rendering performance. I just remain skeptical of the network latency - which has always been the real concern with "cloud gaming".

Nvidia tells you the number of discrete GPU chips on a card:

http://www.nvidia.com/object/grid-boards.html

It's four on a Grid K1 and two on a Grid K2.

-----

Getting enough bandwidth to feed everything is one of the key problems restricting GPU performance.  (Power consumption is the other big one.)  Indeed, the impracticality of getting enough memory bandwidth for it is the only thing stopping AMD from releasing integrated graphics that handily beats $100 discrete video cards.  AMD and Nvidia didn't move to expensive GDDR5 memory rather than cheap DDR3 just for fun.  They did it because there's no other practical way to get the video memory bandwidth needed for a higher end gaming card.
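For a rough sense of that gap, here is a back-of-the-envelope bandwidth comparison (the bus widths and transfer rates are just typical examples from that era, not any particular card):

```cpp
// Memory bandwidth = (bus width in bytes) * (effective transfer rate).
#include <cstdio>

double bandwidthGBs(int busBits, double gigatransfersPerSec) {
    return (busBits / 8.0) * gigatransfersPerSec;
}

int main() {
    std::printf("128-bit DDR3 at 1.8 GT/s:  %.1f GB/s\n", bandwidthGBs(128, 1.8));
    std::printf("128-bit GDDR5 at 5.0 GT/s: %.1f GB/s\n", bandwidthGBs(128, 5.0));
    std::printf("256-bit GDDR5 at 6.0 GT/s: %.1f GB/s\n", bandwidthGBs(256, 6.0));
}
```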

Meanwhile, look at how the memory is laid out on a video card.  They put the GPU chip in the middle of a ring of memory chips, with a ton of traces all over the place connecting the memory to the GPU.  Getting enough bandwidth to a single higher end GPU that doesn't have to communicate with other GPUs at all is hard.

And now you're surmising that Nvidia will magically deliver that kind of bandwidth spread evenly between two GPU chips at once?  And with ridiculous amounts of bandwidth connecting the two GPU chips?  If they could do that, then cloud rendering of graphics is way down the list of obvious applications for it.  For starters, how about SLI and CrossFire that actually work perfectly 100% of the time?

  Quizzical

Guide

Joined: 12/11/08
Posts: 13774

1/09/13 12:16:48 AM#25
Originally posted by lkc673

Weird that companies are busting out handheld consoles all of a sudden. They must have done some good research to see that there is a market for handhelds with casual games. I'm not sure if I would fork out big bucks for a high-end handheld though. With my games I would want to mod or do some creative stuff, and I doubt these handhelds are able to do that as of yet. 

Guess Microsoft is the only one without a handheld device now!!! Let's see if they make one soon!

There's more likely to be a market after the hardware to do it decently is available (i.e., when the new generation of chips such as Tegra 4 comes to market) than before.  Will there be a strong market for it?  We'll find out.

  lizardbones

Hard Core Member

Joined: 6/11/08
Posts: 10953

I think with my heart and move with my head.-Kongos

1/09/13 11:21:17 AM#26


Originally posted by Quizzical

(quoting post #21 above in full)




Sometimes I think you are just trying to punch people with words. I'm going to leave all those words in there, even though anyone else who gets through them will have been punched in the brain.

Wouldn't it make a lot more sense to have more than one player's game per GPU than to have multiple GPUs working on one player's game? That seems to be what they are attempting to do. Most games do not utilize 100% of a GPU's processing power, so being able to capitalize on that would lead to hardware and power savings, even if it doesn't lead to a huge performance improvement. I'm pretty sure their incredibly boring presentation talked about power and cost savings, not a huge performance increase.

Another thing is they are saying what they have doesn't exist anywhere else. Using multiple GPUs per game already exists with the SLI stuff. Multiple people utilizing a single GPU doesn't exist outside of their hardware. It's virtualized hardware for the GPU. I've not even looked at this type of thing for servers, but when running virtual machines, the virtual GPU hardware is bad. Bad, bad, bad, bad, bad. If they can work out a system where virtual GPUs are not garbage, they would have a new product. Much better than an old product that is being used in a suspicious way.

Whether or not they can do it is another question entirely. You seem to have questions about whether even one of their GPUs will give decent performance, much less multiple players' games utilizing a single GPU.

I can not remember winning or losing a single debate on the internet.

  Quizzical

Guide

Joined: 12/11/08
Posts: 13774

1/09/13 11:42:41 AM#27
Originally posted by lizardbones

Sometimes I think you are just trying to punch people with words. I'm going to leave all those words in there, even though anyone else who gets through them will have been punched in the brain.

Wouldn't it make a lot more sense to have more than one player's game per GPU than to have multiple GPUs working on one player's game? That seems to be what they are attempting to do. Most games do not utilize 100% of a GPU's processing power, so being able to capitalize on that would lead to hardware and power savings, even if it doesn't lead to a huge performance improvement. I'm pretty sure their incredibly boring presentation talked about power and cost savings, not a huge performance increase.

Another thing is they are saying what they have doesn't exist anywhere else. Using multiple GPUs per game already exists with the SLI stuff. Multiple people utilizing a single GPU doesn't exist outside of their hardware. It's virtualized hardware for the GPU. I've not even looked at this type of thing for servers, but when running virtual machines, the virtual GPU hardware is bad. Bad, bad, bad, bad, bad. If they can work out a system where virtual GPUs are not garbage, they would have a new product. Much better than an old product that is being used in a suspicious way.

Whether or not they can do it is another question entirely. You seem to have questions about whether even one of their GPUs will give decent performance, much less multiple players' games utilizing a single GPU.

 

Sure, multiple things running on a single GPU simultaneously makes sense.  A Radeon HD 6970 could do two things at once, and at least some GCN cards can probably do more, so AMD is heading in this direction, too.  GPU virtualization in that way makes sense for some people.

SLI and CrossFire use multiple GPUs to render a game, but they use alternate frame rendering.  Splitting a single frame across multiple GPUs simply isn't practical.  It's technically possible, but it would bring such a large performance hit that there's no point.  Well, maybe at ultra-high resolutions, if a game engine was aware that there were two GPUs and did a bunch of custom stuff on the CPU to balance the load between them, it could kind of work as an alternative to CrossFire or SLI.  But that really can't be done purely in video drivers or tacked onto existing games without making some very fundamental changes to the game engine.

-----

One other thought on why they surely aren't planning on splitting a single frame among multiple GPUs:  if they were going to do that, then why not use a higher end GPU instead?  The GPUs in a Grid K1 are basically half of a GK107.  The bandwidth you'd need to make that perform comparably for a single frame might be possible, but it would be so outlandishly expensive that it would be completely stupid to do it that way instead of just using a GK107 and making the entire inter-GPU communication problem vanish.

  Dilweed

Advanced Member

Joined: 8/17/04
Posts: 224

1/09/13 12:01:39 PM#28

Dudes, I've been a member of this forum for almost 10 years.

I'm reading this topic, I'm kinda hardware savvy myself, I'm halfway through and I want to congratulate you guys on this thread +1

Most informative thread (I prolly missed some) I have ever seen on this forum, keep it up, thanks :)

  lizardbones

Hard Core Member

Joined: 6/11/08
Posts: 10953

I think with my heart and move with my head.-Kongos

1/09/13 12:10:58 PM#29


Originally posted by Quizzical

Originally posted by lizardbones Sometimes I think you are just trying to punch people with words. I'm going to leave all those words in there, even though anyone else who gets through them will have been punched in the brain. Wouldn't it make a lot more sense to have more than one player's game per GPU than to have multiple GPUs working on one player's game? That seems to be what they are attempting to do. Most games do not utilize 100% of a GPU's processing power, so being able to capitalize on that would lead to hardware and power savings, even if it doesn't lead to a huge performance improvement. I'm pretty sure their incredibly boring presentation talked about power and cost savings, not a huge performance increase. Another thing is they are saying what they have doesn't exist anywhere else. Using multiple GPUs per game already exists with the SLI stuff. Multiple people utilizing a single GPU doesn't exist outside of their hardware. It's virtualized hardware for the GPU. I've not even looked at this type of thing for servers, but when running virtual machines, the virtual GPU hardware is bad. Bad, bad, bad, bad, bad. If they can work out a system where virtual GPUs are not garbage, they would have a new product. Much better than an old product that is being used in a suspicious way. Whether or not they can do it is another question entirely. You seem to have questions about whether even one of their GPUs will give decent performance, much less multiple players' games utilizing a single GPU.  
Sure, multiple things running on a single GPU simultaneously makes sense.  A Radeon HD 6970 could do two things at once, and at least some GCN cards can probably do more, so AMD is heading in this direction, too.  GPU virtualization in that way makes sense for some people.

SLI and CrossFire use multiple GPUs to render a game, but they use alternate frame rendering.  Splitting a single frame across multiple GPUs simply isn't practical.  It's technically possible, but it would bring such a large performance hit that there's no point.  Well, maybe at ultra-high resolutions, if a game engine was aware that there were two GPUs and did a bunch of custom stuff on the CPU to balance the load between them, it could kind of work as an alternative to CrossFire or SLI.  But that really can't be done purely in video drivers or tacked onto existing games without making some very fundamental changes to the game engine.

-----

One other thought on why they surely aren't planning on splitting a single frame among multiple GPUs:  if they were going to do that, then why not use a higher end GPU instead?  The GPUs in a Grid K1 are basically half of a GK107.  The bandwidth you'd need to make that perform comparably for a single frame might be possible, but it would be so outlandishly expensive that it would be completely stupid to do it that way instead of just using a GK107 and making the entire inter-GPU communication problem vanish.




The best case scenario sounds like the system has to render a frame, and picks the next available GPU to render that frame. The frames are rendered quickly because there's always another physical GPU to render the next frame. Physical GPUs aren't assigned to any specific game or user, only the next available frame. The game, however, sees one virtual GPU and the game doesn't have to worry about it too much or be rewritten to function. They are doing the usual rendering, just very efficiently.

I think the big deal is virtualizing the GPUs.

If they can successfully virtualize GPUs so that the virtual GPU seen by a game isn't complete garbage, people could boot up their Linux system, and then start a virtual Windows machine to run games, and it would actually work. I'll admit, that's a big stretch since they've targeted servers, but that's one of the unintended results I'd like to see.

I can not remember winning or losing a single debate on the internet.

  Quizzical

Guide

Joined: 12/11/08
Posts: 13774

1/09/13 12:24:13 PM#30
Originally posted by lizardbones

 


The best case scenario sounds like this: when the system has to render a frame, it picks the next available GPU to render that frame. The frames are rendered quickly because there's always another physical GPU ready to render the next frame. Physical GPUs aren't assigned to any specific game or user, only to the next available frame. The game, however, sees one virtual GPU, so it doesn't have to worry about any of this or be rewritten to function. They are doing the usual rendering, just very efficiently.

I think the big deal is virtualizing the GPUs.

If they can successfully virtualize GPUs so that the virtual GPU seen by a game isn't complete garbage, people could boot up their Linux system, and then start a virtual Windows machine to run games, and it would actually work. I'll admit, that's a big stretch since they've targeted servers, but that's one of the unintended results I'd like to see.

 

Picking a GPU arbitrarily when you decide it's time to render a frame isn't practical, either, at least for gaming.  What do you do about the hundreds of MB of data buffered in one particular GPU's video memory that you need to render that particular frame?  If you pick a different GPU, then you have to copy all of the data over there.  Do that every frame and you might end up spending as much time copying buffered data around as you actually do rendering frames.  If you only switch GPUs once in a while, or if you're running an office application that doesn't need much data buffered, then it's more viable.
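To put rough numbers on that copying penalty, here is a back-of-the-envelope Python sketch. The buffer size, effective PCIe throughput, and frame-rate target are assumed round figures for illustration, not anything Nvidia has published.

# Rough arithmetic for migrating a game's buffered data to a different GPU every frame.
# All numbers are assumptions: a few hundred MB of textures and buffers, roughly
# 12 GB/s of effective PCIe 3.0 x16 throughput, and a 60 fps target.

buffered_data_gb = 0.5        # ~500 MB of textures, vertex buffers, render targets
pcie_throughput_gbps = 12.0   # effective GB/s after protocol overhead
target_fps = 60

copy_time_ms = buffered_data_gb / pcie_throughput_gbps * 1000
frame_budget_ms = 1000 / target_fps

print(f"Copy time per GPU switch: {copy_time_ms:.1f} ms")             # ~41.7 ms
print(f"Frame budget at {target_fps} fps: {frame_budget_ms:.1f} ms")  # ~16.7 ms
# The copy alone would blow the frame budget, which is why switching GPUs
# every frame isn't viable while switching only occasionally can be.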

Booting up Linux and then running a virtual Windows machine in it to play games does strike me as the sort of thing that ought to be possible sometime soonish.  Though for that, you'd just want an ordinary GeForce or Radeon card, not a Grid card.
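On the Linux-plus-Windows-VM point, one common route today is PCI passthrough of a whole GeForce or Radeon rather than a truly virtualized GPU. As a hedged sketch, the snippet below just lists IOMMU groups on a Linux host to see whether passthrough is even an option; it assumes the standard sysfs layout and a kernel booted with the IOMMU enabled.

# Sketch: list IOMMU groups on a Linux host to gauge whether a discrete GPU
# could be handed to a Windows guest via PCI passthrough (VFIO).
# Assumes the IOMMU is enabled (VT-d / AMD-Vi plus the matching kernel option);
# if /sys/kernel/iommu_groups is empty, passthrough isn't available as configured.

from pathlib import Path

def list_iommu_groups():
    groups_root = Path("/sys/kernel/iommu_groups")
    groups = sorted(groups_root.glob("*"), key=lambda p: int(p.name)) if groups_root.is_dir() else []
    if not groups:
        print("No IOMMU groups found; enable VT-d/AMD-Vi before attempting passthrough.")
        return
    for group in groups:
        devices = sorted(d.name for d in (group / "devices").iterdir())
        print(f"IOMMU group {group.name}: {', '.join(devices)}")

if __name__ == "__main__":
    list_iommu_groups()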

  botrytis

Advanced Member

Joined: 1/04/05
Posts: 2564

1/09/13 12:59:55 PM#31
Originally posted by Quizzical
Originally posted by lizardbones

 


Originally posted by Quizzical

Originally posted by lizardbones No, one box would give you 35 entry-level GPUs. So twenty boxes give you 700 customers, each with entry-level GPU performance. This is according to them. ** edit ** So you have the equivalent of an entry-level GPU per customer, just with far fewer physical GPUs. OnLive needed a physical GPU per customer.
One physical Nvidia Grid card has four physical GPUs in it, and they're the lowest end discrete GPU of the generation--from either major graphics vendor.  They'll probably be clocked down from GeForce cards, and might even be paired with DDR3 memory instead of GDDR5.  That's not going to get you the performance of 35 entry level GPUs at once, unless by "entry-level", you mean something ancient like GeForce G 210 or Radeon HD 4350.


I have no idea. What they are saying is each box has 24 GPUs, which is a substantial cost improvement over having 24 discrete video cards. These are GPUs developed specifically for the GRID servers, not their regular GPUs. They've combined this with software that allows for load balancing and virtual hardware stacks(?), which means one GPU can support several users, reducing hardware costs and also substantially reducing power requirements.

 

Nvidia now says that there are two such cards.

http://www.nvidia.com/object/grid-boards.html

The lower end version (Grid K1) is four GK107 chips paired with DDR3 memory in a 130 W TDP.  That means it's basically four of these, except clocked a lot lower:

http://www.newegg.com/Product/Product.aspx?Item=N82E16814130818

That's stupidly overpriced, by the way; on a price/performance basis, you could maybe justify paying $70, but not more than a far superior Radeon HD 7750 costs.  Oh, and that's before you turn the clock speeds way down to save on power consumption.

The higher end version (Grid K2) is two GK104 GPUs in a 225 W TDP.  That means basically two 4 GB GeForce GTX 680s, except clocked a lot lower, in order to have two of them on a card barely use more power than a single "real" GTX 680.
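As a rough illustration of how far those clocks would have to come down, here is a quick comparison sketch in Python. The GeForce TDP figures are approximate spec numbers, and splitting the board TDP evenly across the GPUs is a simplifying assumption that ignores memory, VRM, and cooling overhead.

# Rough per-GPU power budget for the Grid boards versus their closest GeForce cousins.
# TDPs are approximate public spec figures; dividing board TDP evenly per GPU is a
# simplifying assumption (it ignores memory, VRMs, and cooling overhead).

boards = {
    # name: (board TDP in watts, GPUs per board, comparable single-GPU GeForce TDP)
    "Grid K1 (4x GK107)": (130, 4, 64),   # GeForce GTX 650 is roughly 64 W
    "Grid K2 (2x GK104)": (225, 2, 195),  # GeForce GTX 680 is roughly 195 W
}

for name, (board_tdp, gpu_count, geforce_tdp) in boards.items():
    per_gpu = board_tdp / gpu_count
    print(f"{name}: ~{per_gpu:.0f} W per GPU vs ~{geforce_tdp} W for the GeForce part "
          f"({per_gpu / geforce_tdp:.0%} of the power budget)")
# Grid K1: ~33 W per GPU vs ~64 W  -- roughly half the power, hence much lower clocks
# Grid K2: ~113 W per GPU vs ~195 W -- again a substantially smaller power budget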

Now yes, the Grid cards might make a lot of sense for a service like OnLive.  (AMD either offers or will soon offer Trinity-based Opteron chips with integrated graphics that might also make a ton of sense for something like OnLive.)  What doesn't make sense is for customers to pay for any streaming service based on the Grid K1 cards.  But OnLive was always targeted mainly at the clueless, so nothing changes there.

They aren't custom chips.  You don't do custom chips for low-volume parts.  Nvidia doesn't even do custom chips for Quadro cards, and that's a huge cash cow.  A different bin of existing chips, yes, but that's far from doing a custom chip.  It might be a special salvage bin with something fused off that the consumer cards need.  Nvidia even explicitly says that they're Kepler GPU chips.

These boards were used in the latest supercomputer -  http://www.engadget.com/2012/11/12/titan-supercomputer-leads-latest-top-500-list-as-newly-available/  so don't say they are not good.

"In 50 years, when I talk to my grandchildren about these days, I'll make sure to mention what an accomplished MMO player I was. They are going to be so proud ..."
by Naqaj - 7/17/2013 MMORPG.com forum

  Quizzical

Guide

Joined: 12/11/08
Posts: 13774

1/09/13 1:09:29 PM#32
Originally posted by botrytis
Originally posted by Quizzical
Originally posted by lizardbones

 


Originally posted by Quizzical

Originally posted by lizardbones No, one box would give you 35 entry-level GPUs. So twenty boxes give you 700 customers, each with entry-level GPU performance. This is according to them. ** edit ** So you have the equivalent of an entry-level GPU per customer, just with far fewer physical GPUs. OnLive needed a physical GPU per customer.
One physical Nvidia Grid card has four physical GPUs in it, and they're the lowest end discrete GPU of the generation--from either major graphics vendor.  They'll probably be clocked down from GeForce cards, and might even be paired with DDR3 memory instead of GDDR5.  That's not going to get you the performance of 35 entry level GPUs at once, unless by "entry-level", you mean something ancient like GeForce G 210 or Radeon HD 4350.


I have no idea. What they are saying is each box has 24 GPUs, which is a substantial cost improvement over having 24 discrete video cards. These are GPUs developed specifically for the GRID servers, not their regular GPUs. They've combined this with software that allows for load balancing and virtual hardware stacks(?), which means one GPU can support several users, reducing hardware costs and also substantially reducing power requirements.

 

Nvidia now says that there are two such cards.

http://www.nvidia.com/object/grid-boards.html

The lower end version (Grid K1) is four GK107 chips paired with DDR3 memory in a 130 W TDP.  That means it's basically four of these, except clocked a lot lower:

http://www.newegg.com/Product/Product.aspx?Item=N82E16814130818

That's stupidly overpriced, by the way; on a price/performance basis, you could maybe justify paying $70, but not more than a far superior Radeon HD 7750 costs.  Oh, and that's before you turn the clock speeds way down to save on power consumption.

The higher end version (Grid K2) is two GK104 GPUs in a 225 W TDP.  That means basically two 4 GB GeForce GTX 680s, except clocked a lot lower, in order to have two of them on a card barely use more power than a single "real" GTX 680.

Now yes, the Grid cards might make a lot of sense for a service like OnLive.  (AMD either offers or will soon offer Trinity-based Opteron chips with integrated graphics that might also make a ton of sense for something like OnLive.)  What doesn't make sense is for customers to pay for any streaming service based on the Grid K1 cards.  But OnLive was always targeted mainly at the clueless, so nothing changes there.

They aren't custom chips.  You don't do custom chips for low-volume parts.  Nvidia doesn't even do custom chips for Quadro cards, and that's a huge cash cow.  A different bin of existing chips, yes, but that's far from doing a custom chip.  It might be a special salvage bin with something fused off that the consumer cards need.  Nvidia even explicitly says that they're Kepler GPU chips.

These boards were used in the latest supercomputer -  http://www.engadget.com/2012/11/12/titan-supercomputer-leads-latest-top-500-list-as-newly-available/  so don't say they are not good.

Are you sure that it's using Grid cards as opposed to Tesla?  The latter is Nvidia's variant on the same GK104 GPU chip that was built with supercomputers in mind.

  lizardbones

Hard Core Member

Joined: 6/11/08
Posts: 10953

I think with my heart and move with my head.-Kongos

1/09/13 1:14:42 PM#33


Originally posted by Quizzical

Originally posted by lizardbones  

Originally posted by Quizzical

Originally posted by lizardbones Sometimes I think you are just trying to punch people with words. I'm going to leave all those words in there, even though anyone else who gets through them will have been punched in the brain. Wouldn't it make a lot more sense to have more than one player's game per GPU than to have multiple GPUs working on one player's game? That seems to be what they are attempting to do. Most games do not utilize 100% of a GPU's processing, so being able to capitalize on that would lead to a hardware and power savings, even if it doesn't lead to a huge performance improvement. I'm pretty sure their incredibly boring presentation talked about a power and cost savings, not a huge performance increase. Another thing is they are saying what they have doesn't exist anywhere else. Using multiple GPUs per game already exists with the SLI stuff. Multiple people utilizing a single GPU doesn't exist outside of their hardware. It's virtualized hardware for the GPU. I've not even looked at this type of thing for servers, but when running virtual machines, the virtual GPU hardware is bad. Bad, bad, bad, bad, bad. If they can work out a system where virtual GPUs aren't garbage, they would have a new product. Much better than an old product that is being used in a suspicious way. Whether or not they can do it is another question entirely. You seem to have questions about whether even one of their GPUs will give decent performance, much less multiple players' games utilizing a single GPU.
Sure, multiple things running on a single GPU simultaneously makes sense.  A Radeon HD 6970 could do two things at once, and at least some GCN cards can probably do more, so AMD is heading in this direction, too.  GPU virtualization in that way makes sense for some people.   SLI and CrossFire use multiple GPUs to render a game, but they use alternate frame rendering.  Splitting a single frame across multiple GPUs simply isn't practical.  It's technically possible, but it would bring such a large performance hit that there's no point.  Well, maybe at ultra-high resolutions, if a game engine was aware that there were two GPUs and did a bunch of custom stuff on the CPU to balance the load between them, it could kind of work as an alternative to CrossFire or SLI.  But that really can't be done purely in video drivers or tacked onto existing games without making some very fundamental changes to the game engine. ----- One other thought on why they surely aren't planning on splitting a single frame among multiple GPUs:  if they were going to do that, then why not use a higher end GPU instead?  The GPUs in a Grid K1 are basically half of a GK107.  The bandwidth you'd need to make that perform comparably for a single frame might be possible, but it would be so outlandishly expensive that it would be completely stupid to do it that way instead of just using a GK107 and making the entire inter-GPU communication problem vanish.
The best case scenario sounds like this: when the system has to render a frame, it picks the next available GPU to render that frame. The frames are rendered quickly because there's always another physical GPU ready to render the next frame. Physical GPUs aren't assigned to any specific game or user, only to the next available frame. The game, however, sees one virtual GPU, so it doesn't have to worry about any of this or be rewritten to function. They are doing the usual rendering, just very efficiently. I think the big deal is virtualizing the GPUs. If they can successfully virtualize GPUs so that the virtual GPU seen by a game isn't complete garbage, people could boot up their Linux system, and then start a virtual Windows machine to run games, and it would actually work. I'll admit, that's a big stretch since they've targeted servers, but that's one of the unintended results I'd like to see.
Picking a GPU arbitrarily when you decide it's time to render a frame isn't practical, either, at least for gaming.  What do you do about the hundreds of MB of data buffered in one particular GPU's video memory that you need to render that particular frame?  If you pick a different GPU, then you have to copy all of the data over there.  Do that every frame and you might end up spending as much time copying buffered data around as you actually do rendering frames.  If you only switch GPUs once in a while, or if you're running an office application that doesn't need much data buffered, then it's more viable.

Booting up Linux and then running a virtual Windows machine in it to play games does strike me as the sort of thing that ought to be possible sometime soonish.  Though for that, you'd just want an ordinary GeForce or Radeon card, not a Grid card.




Some more Googling turns up this tidbit from Joystiq:


According to Nvidia, a single Grid server can enable up to 24 HD quality game streams. At CES Nvidia showcased the Grid gaming system, which incorporates 20 Grid servers into a single rack. Nvidia says the rack is capable of producing 36 times the amount of HD-quality game streams as 'first-generation cloud gaming systems.'



Since there are 24 GPUs per unit, and each unit can put out 24 HD quality game streams, it seems like each player gets a dedicated GPU. The cost savings is in not having to buy a separate, physical video card per player. It's still a GPU per player, but the system is designed from the ground up around the idea rather than using off the shelf hardware.
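The arithmetic behind that reading is simple enough to spell out. Here is a quick Python sketch using only the figures in the quoted Joystiq blurb; the "first-generation" estimate is just the division those claims imply, not a published number.

# Arithmetic from the quoted Joystiq figures: 24 HD streams per Grid server,
# 20 servers per rack, and Nvidia's claim of 36x a "first-generation" setup.
# The first-generation estimate below is simply the division those claims imply.

streams_per_server = 24
servers_per_rack = 20
claimed_multiplier = 36

streams_per_rack = streams_per_server * servers_per_rack
implied_first_gen_streams = streams_per_rack / claimed_multiplier

print(f"Streams per rack: {streams_per_rack}")                                         # 480
print(f"Implied first-generation streams per rack: {implied_first_gen_streams:.1f}")   # ~13.3
# With 24 GPUs and 24 streams per server, that is one GPU per concurrent player,
# which matches the "dedicated GPU per player" reading above.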

Another thing to note is that their presentation showed the Shield device hooking into the GRID cloud, not just the user's PC. That might be a big deal.

I've been waiting for virtualized graphics for a loooong time. I set my Linux machine aside so I could play video games again and really, that's the only reason I don't run it now.

I can not remember winning or losing a single debate on the internet.

  Quizzical

Guide

Joined: 12/11/08
Posts: 13774

1/09/13 1:36:00 PM#34
Originally posted by lizardbones


According to Nvidia, a single Grid server can enable up to 24 HD quality game streams. At CES Nvidia showcased the Grid gaming system, which incorporates 20 Grid servers into a single rack. Nvidia says the rack is capable of producing 36 times the amount of HD-quality game streams as 'first-generation cloud gaming systems.'



Since there are 24 GPUs per unit, and each unit can put out 24 HD quality game streams, it seems like each player gets a dedicated GPU. The cost savings is in not having to buy a separate, physical video card per player. It's still a GPU per player, but the system is designed from the ground up around the idea rather than using off the shelf hardware.

Another thing to note is that their presentation showed the Shield device hooking into the GRID cloud, not just the user's PC. That might be a big deal.

 

A cynic would point out that 36 times zero is still zero.

A demonstration doesn't mean much in situations like this, as it doesn't let you feel input latency.  Remember Intel showing off their new graphics running a game, and it turned out they were just playing a video of the game?

  lizardbones

Hard Core Member

Joined: 6/11/08
Posts: 10953

I think with my heart and move with my head.-Kongos

1/09/13 2:00:54 PM#35


Originally posted by Quizzical

Originally posted by lizardbones

According to Nvidia, a single Grid server can enable up to 24 HD quality game streams. At CES Nvidia showcased the Grid gaming system, which incorporates 20 Grid servers into a single rack. Nvidia says the rack is capable of producing 36 times the amount of HD-quality game streams as 'first-generation cloud gaming systems.'
Since there are 24 GPUs per unit, and each unit can put out 24 HD quality game streams, it seems like each player gets a dedicated GPU. The cost savings is in not having to buy a separate, physical video card per player. It's still a GPU per player, but the system is designed from the ground up around the idea rather than using off the shelf hardware. Another thing to note is that their presentation showed the Shield device hooking into the GRID cloud, not just the user's PC. That might be a big deal.  
A cynic would point out that 36 times zero is still zero.

A demonstration doesn't mean much in situations like this, as it doesn't let you feel input latency.  Remember Intel showing off their new graphics running a game, and it turned out they were just playing a video of the game?




Heh.

Well, they've sold it to six different companies around the world, so if it works, I'm sure we'll know soon enough. Of course Nvidia is going to say it's awesome, but regular people are going to use the service and I expect those people will probably have things like Twitter and what not.

** edit **

I'm not a cynic, but I reserve the right to change my mind at the drop of a hat.

I can not remember winning or losing a single debate on the internet.

  Quizzical

Guide

Joined: 12/11/08
Posts: 13774

1/09/13 2:09:51 PM#36
Originally posted by lizardbones

 


Originally posted by Quizzical

Originally posted by lizardbones

According to Nvidia, a single Grid server can enable up to 24 HD quality game streams. At CES Nvidia showcased the Grid gaming system, which incorporates 20 Grid servers into a single rack. Nvidia says the rack is capable of producing 36 times the amount of HD-quality game streams as 'first-generation cloud gaming systems.'
Since there are 24 GPUs per unit, and each unit can put out 24 HD quality game streams, it seems like each player gets a dedicated GPU. The cost savings is in not having to buy a separate, physical video card per player. It's still a GPU per player, but the system is designed from the ground up around the idea rather than using off the shelf hardware. Another thing to note is that their presentation showed the Shield device hooking into the GRID cloud, not just the user's PC. That might be a big deal.  
A cynic would point out that 36 times zero is still zero.

 

A demonstration doesn't mean much in situations like this, as it doesn't let you feel input latency.  Remember Intel showing off their new graphics running a game, and it turned out they were just playing a video of the game?




Heh.

Well, they've sold it to six different companies around the world, so if it works, I'm sure we'll know soon enough. Of course Nvidia is going to say it's awesome, but regular people are going to use the service and I expect those people will probably have things like Twitter and what not.

** edit **

I'm not a cynic, but I reserve the right to change my mind at the drop of a hat.

 

Sold it to six companies, but to do what exactly?  If companies are buying it for graphics virtualization in a corporate environment where you don't have to do much more than display the desktop and office applications, that's a long way away from streaming games over the Internet.  Even 1/9 of a fairly weak GPU surely beats no GPU.

And even if companies are buying it to stream games over the Internet, OnLive has bought a bunch of video cards for that in the past, too.  We saw how that turned out.

  lizardbones

Hard Core Member

Joined: 6/11/08
Posts: 10953

I think with my heart and move with my head.-Kongos

1/09/13 2:22:31 PM#37


Originally posted by Quizzical

Originally posted by lizardbones  

Originally posted by Quizzical

Originally posted by lizardbones

According to Nvidia, a single Grid server can enable up to 24 HD quality game streams. At CES Nvidia showcased the Grid gaming system, which incorporates 20 Grid servers into a single rack. Nvidia says the rack is capable of producing 36 times the amount of HD-quality game streams as 'first-generation cloud gaming systems.'
Since there are 24 GPUs per unit, and each unit can put out 24 HD quality game streams, it seems like each player gets a dedicated GPU. The cost savings is in not having to buy a separate, physical video card per player. It's still a GPU per player, but the system is designed from the ground up around the idea rather than using off the shelf hardware. Another thing to note is that their presentation showed the Shield device hooking into the GRID cloud, not just the user's PC. That might be a big deal.  
A cynic would point out that 36 times zero is still zero.   A demonstration doesn't mean much in situations like this, as it doesn't let you feel input latency.  Remember Intel showing off their new graphics running a game, and it turned out they were just playing a video of the game?
Heh. Well, they've sold it to six different companies around the world, so if it works, I'm sure we'll know soon enough. Of course Nvidia is going to say it's awesome, but regular people are going to use the service and I expect those people will probably have things like Twitter and what not. ** edit ** I'm not a cynic, but I reserve the right to change my mind at the drop of a hat.  
Sold it to six companies, but to do what exactly?  If companies are buying it for graphics virtualization in a corporate environment where you don't have to do much more than display the desktop and office applications, that's a long way away from streaming games over the Internet.  Even 1/9 of a fairly weak GPU surely beats no GPU.

And even if companies are buying it to stream games over the Internet, OnLive has bought a bunch of video cards for that in the past, too.  We saw how that turned out.




One of the companies is Ubitus, which is working with Google to provide cloud-based gaming. Another is named PlayCast, which certainly sounds like a gaming company. So at least one of the six (Ubitus) will definitely be providing cloud-based gaming services.

I think the key selling point is the cost. This is supposed to be cheaper than what OnLive tried to do. Not only are the systems cheaper, but they are fairly energy efficient as well. You get more video power than the equivalent number of Xbox 360s, but at 1/5th the power consumption. OnLive had the added struggle of trying something new while using nearly off-the-shelf hardware.

That doesn't mean it'll work. The service isn't going to be free.

I can not remember winning or losing a single debate on the internet.
