My storage estimate for a full-eye resolution 3D movie
Moderator:Moderators
- benheck
- Site Admin
- Posts:1880
- Joined:Wed Dec 31, 1969 6:00 pm
- Location:Wisconsin - Land of Beer and Cheese and Beer
- Contact:
I saw Space Station 3D the other night at the local IMAX (it rocks, 5 minutes from me!) and it's got me thinking about 3D movies and stereo vision.
Checking How Stuff Works, it appears each eyeball has 100 million brightness sensors (rods) and 7 million color sensors (cones). Each color sensor detects either red, blue or green. Hence, the eye is MUCH more sensitive to light/detail than color, which is why video compression usually throws out half to a quarter of the color samples. (4:1:1 compression)
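The savings from throwing out chroma samples are easy to quantify. Here's a minimal sketch (function and parameter names are mine, not from any real codec API) comparing full-color 4:4:4 against 4:1:1 subsampling at that 4096x2048 camera resolution:

```python
def bytes_per_frame(width, height, luma_bytes=1, chroma_bytes=1,
                    chroma_subsample=1):
    """Rough uncompressed frame size for a Y'CbCr-style encoding.

    chroma_subsample = number of luma samples sharing one pair of
    chroma samples: 1 for 4:4:4 (full color), 4 for 4:1:1.
    """
    pixels = width * height
    luma = pixels * luma_bytes
    chroma = (pixels // chroma_subsample) * 2 * chroma_bytes  # Cb + Cr
    return luma + chroma

full = bytes_per_frame(4096, 2048, chroma_subsample=1)  # 4:4:4
sub = bytes_per_frame(4096, 2048, chroma_subsample=4)   # 4:1:1
print(full, sub, sub / full)  # 4:1:1 needs half the bytes of 4:4:4
```

So discarding three quarters of the chroma halves the raw frame size, which is roughly the "half to a quarter of the color samples" saving mentioned above.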
So that's effectively 107 million "samples" per eyeball. Multiply that by 2 (you know, both eyes) and you get 214 million samples.
Say someone wanted to make a movie in "Full Eye Resolution". If you spread those samples over a square grid, that works out to roughly 10,350 x 10,350 pixels per eye (the square root of 107 million). That would be pretty insane, as the most bleeding-edge digital film cameras right now top out at 4096x2048. Maybe there's no reason to, but I like crunching numbers. Correct me if I'm wrong...
214,000,000 samples, at 1 byte per sample (range 0-255). For every ~14 luma samples there's 1 color sample (either R, G, or B). Consider a 1-byte value for each color sample as well (Red 0-255, Green 0-255, Blue 0-255).
So 214 million samples, at 8 bits (1 byte) a sample, would be 214 megabytes. Per frame. So, for a really cool 60-frames-per-second movie (that would be like "real life"), rounding down to 200 MB per frame...
200 MB * 60 fps = 12 gigabytes per second
12 * 60 = 720 gigabytes per minute
720 * 120 = 86,400 gigabytes (86.4 terabytes) for a 2-hour movie.
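That back-of-the-envelope chain is easy to reproduce in a few lines, using the post's own figures (107 million samples per eye, 1 byte each, both eyes, 60 fps):

```python
samples_per_eye = 100_000_000 + 7_000_000   # rods + cones
bytes_per_frame = samples_per_eye * 2       # both eyes, 1 byte/sample
mb_per_frame = bytes_per_frame / 1_000_000  # 214 MB/frame

gb_per_second = mb_per_frame * 60 / 1000    # 60 fps
gb_per_minute = gb_per_second * 60
tb_per_movie = gb_per_minute * 120 / 1000   # 2-hour movie

print(f"{mb_per_frame:.0f} MB/frame")   # 214 MB/frame
print(f"{gb_per_second:.2f} GB/s")      # 12.84 GB/s
print(f"{tb_per_movie:.1f} TB/movie")   # 92.4 TB/movie
```

Without the rounding of 214 MB down to 200 MB, the two-hour total comes out closer to 92 TB than 86.4 TB; the order of magnitude is the same either way.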
I wonder what a movie like this would look like... I wonder if any machine could possibly pull it off?
-
- Posts:1093
- Joined:Mon Apr 25, 2005 8:52 am
I propose you do one of those group computing things to use all the computers in the world to attempt to encode a 1-minute clip (of course it would need to be produced from vector images to avoid pixelation).
Then we all go get a 360,
Or
999999999999999999999999999999999999999999 PS3s and their low GPP,
Or
an infinite number of Revolutions, to try and play it.
Or
of course just get a dual-core AMD.
Yeah, your calculations actually sound like too low an amount of data; I'd have expected it to be somewhere in the yottabyte area. Also, the human eye can see more than 60 frames per second; our brain just sort of blurs them together. I can see the frames changing if I concentrate on a computer game running at 60 fps. Cats can see at something like 200 fps, so to them 60 fps still looks like a slide show.
- Triton
- Moderator
- Posts:7397
- Joined:Mon May 24, 2004 12:33 pm
- 360 GamerTag:triton199
- Steam ID:triton199
- Location:Iowa
- Contact:
Maybe a seriously heavy-duty distributed computing network: a few HUGE servers, quad 4 GHz processors, etc., and some massive HDDs. That's about 3/4 of a TB a minute. Wow.
Visit us at Portablesofdoom.org
-
- Posts:1093
- Joined:Mon Apr 25, 2005 8:52 am
Well, as of now we don't have a device to capture video at anywhere close to that resolution, so we shouldn't expect to see stuff at this resolution for maybe 20 years. At that rate we can expect computers to be anywhere from 30 to 60 GHz, or most likely more if we take into account the few-MHz computers that came out over 20 years ago.
If we have 250 GB hard drives available dirt cheap and people with 4 gigs of RAM in their PCs now, then in 20 years we can expect people to be able to go out and pick up a 512 or even a 768 gigabyte memory stick, put 4 in their computer, and have hard drives easily into at least 10 terabytes, and more than likely more.
All you would need is a dual 60 GHz processor, maybe in a shared network with 10-20 drives like in a server bank, and some way to sync up the video loaded from one drive to the next without a noticeable gap between the HDs.
Either way, it will happen, and very soon we could all be watching HHDTV with resolutions better than real life (even though that's dumb, because we can't see better than real life).
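The extrapolation in this post amounts to assuming capacity keeps doubling at some fixed interval. A toy sketch (the two-year doubling period is an assumption for illustration, not a measured figure):

```python
def extrapolate(today, years, doubling_years=2.0):
    """Project a capacity forward assuming steady doubling."""
    return today * 2 ** (years / doubling_years)

# Starting from a 250 GB drive and doubling every 2 years for 20 years:
future_hdd_gb = extrapolate(250, 20)
print(f"{future_hdd_gb / 1000:.0f} TB")  # 2**10 = 1024x growth -> 256 TB
```

By that crude rule of thumb a single drive would pass the ~86 TB needed for one full-eye-resolution movie well inside the 20-year window, which is roughly the argument being made here.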
Re: My storage estimate for a full-eye resolution 3D movie
benheck wrote: ...it appears each eyeball has 100 million brightness sensors (rods) and 7 million color sensors (cones)... So that's effectively 107 million "samples" per eyeball.
Actually, you'd have to multiply 100 million times 7 million, since it's 100 million brightness levels of each of the 7 million colors. Then, as you said, multiply times 2.
1,400,000,000,000,000 "samples"
Or did I think wrongly?
-Luke
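The two counting models in this exchange are easy to compare. The question is whether rods and cones are separate samples (so the counts add, with brightness range carried in each sample's bit depth) or whether brightness levels multiply the color sensor count. A quick check of both:

```python
rods, cones = 100_000_000, 7_000_000

# Model 1 (original post): each photoreceptor is one sample.
additive = (rods + cones) * 2      # both eyes -> 214,000,000

# Model 2 (this reply): rods x cones, then both eyes.
multiplicative = rods * cones * 2  # -> 1,400,000,000,000,000

print(f"{additive:,} vs {multiplicative:,}")
```

Physiologically the additive model is the closer one: each rod or cone is a single sensor reporting one value, so the eye's brightness resolution is a property of each sample (like a pixel's bit depth), not a multiplier on the sample count.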
When the PS9 comes out I'm sure we'll be able to make movies with that kind of resolution and complexity.
Also, to put into perspective how much computing power you'd need to render the graphics: The Matrix used clusters of about 8 computer CPUs, and it took hours to do special effects like Trinity's jump kick and her leap through the window during the chase by the agent.
Now, if you wanted to get into 3D movies, I'd suggest getting the second DVD that comes with the 2004 release of Appleseed. They talk about the live-action capture, designing the city and objects, and all the complex issues that followed. Be warned: you will be reading subtitles (no English dub), since they're interviewing the staff.
Also, if you really want to do something new with movies, something that won't eat up terahertz of CPU power: 5.1 audio. Movies like Monster have their music rendered in that format, and it adds drastically to the cinematic experience.
vskid wrote:Nerd = likes school, does all their homework, dies if they don't get 100% on every assignment
Geek = likes technology, dies if the power goes out and his UPS dies too
I am a geek.
-
- Posts:1093
- Joined:Mon Apr 25, 2005 8:52 am
Just a quick comment (this is a topic which would lend itself better to a conversation than to typed comments):
One has to remember that there is a significant (huge) amount of "post-processing" done in the brain after the rods and cones detect light. From simple things like "flipping the image" so it is right side up to complex things like combining two images in a way that generates a perception of depth. Also included are the brain's learned ability to quickly extract hard edges from what we see and to focus (mentally, not physically) on certain parts of a viewed image while blocking out (but, importantly, not totally ignoring) the periphery.
There is much more where that came from, I am sure (I am not an expert on these things). But my point is that even though there is a huge amount of data being received by the eye, even more "detail" is being created by the manner in which the image is interpreted by the brain. Not that this "detail" is always correct - have you ever seen one of those spirals that you stare at for a while and then look at your friend's head, which starts to shrink (or expand)? Or check out this site. Or this one. The human brain acts like one of those "step-up" DVD players that convert movies from 480 to 1080 via image interpolation, not by actually knowing all of the 1080 image information.
But this doesn't really affect Ben's "FE" Res idea (well, maybe the direct/peripheral vision stuff does). However, I don't believe that when we are walking around looking at stuff (or the BenHeck forums) we are using all of our rods and cones. What I mean is that all of the rods and cones might be receiving light and sending a signal to the brain, but the brain might be ignoring a big part. I am not totally sure of this, but it seems correct - and here is why: Ever notice how the most difficult time of the day to see is at dusk? This is because our brain isn't sure if it should be using color vision (primarily cones) or light/dark vision (primarily rods). During the day, we use color vision, while at night, we (generally) use dark/light vision. I think the brain automatically ignores a lot of the raw visual information it is receiving, based on its characteristics - if it is "bright" (i.e., the cones are sending a lot of good, quality data), then it says "I think I'll use the cones here; rods - you are riding shotgun." On the other hand, if it is "dark," then it says "cones, you are looking a little weak right now, y'all are on the bench; rods - get out and play!" Or something like that. So my point is that to mimic FE Res, you wouldn't need to match the full "max resolution" of the eye, but (given the lightness of the image) only as many rods/cones as the brain would actually use to view the image.
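That cones-vs-rods weighting idea can be caricatured in a few lines. The thresholds and weights below are invented purely for illustration - they are not physiological data - but they show how an "effective sample count" could depend on scene brightness:

```python
def effective_samples(luminance, rods=100_000_000, cones=7_000_000):
    """Toy model: how many receptors the brain 'uses' at a light level.

    luminance is on a 0.0-1.0 scale; thresholds and the 0.1/0.5
    weights are illustrative guesses, not measured values.
    """
    if luminance > 0.5:        # bright (photopic): cones dominate
        return cones + rods * 0.1
    elif luminance < 0.05:     # dark (scotopic): rods dominate
        return rods + cones * 0.1
    else:                      # dusk (mesopic): both partially used
        return 0.5 * (rods + cones)

print(effective_samples(0.9) / 1e6)   # bright scene: ~17M samples
print(effective_samples(0.01) / 1e6)  # dark scene: ~100.7M samples
```

Under any model like this, the budget for a given frame is well below the 107-million-per-eye ceiling, which is the point being made: "full eye resolution" overshoots what the brain actually uses at any one light level.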
Sorry to have rambled so long.