If I remember correctly from a post on Notch's blog, which I read a couple of years ago and which he had most likely written years before that, Minecraft terrain gen works something like this (or at least worked something like this before the changes in Beta 1.8):
1. First, a 3D Perlin noisemap is generated to assign some of the blocks in the world a "density" value between 0 and 1. The distribution is centered around 0.5, so approximately half of the blocks have density > 0.5, and the other half have density < 0.5.
1a. Actually, the generator only samples the Perlin noisemap for a small fraction of blocks (those whose coordinates are multiples of 4 or 8, or something like that) and linearly interpolates the rest from those samples. This both smooths the terrain somewhat (fewer random single floating blocks everywhere) and makes it run faster.
2. Then, in the Overworld, a number is added to the density of each block based on the block's height. This number is 0 at y-level 64 and changes linearly with height from there, rising below that level and falling above it. I don't know the exact details, but I would guess that it hits +1 somewhere around y-level 32, and -1 somewhere around y-level 96. This step is skipped in the Nether. In the End, I'm guessing that the density skewing is based on distance from (0, 0, 64) rather than on height.
3. At this point, I'm guessing, the density is further skewed based on biome. Densities are lowered in oceans, raised in mountains, flattened somehow in plains, etc.
4. Now, all blocks with density > 0.5 are turned into stone, and all blocks with density < 0.5 are turned into air. Thanks to step 2, all the blocks at high y-levels become air, and all the blocks at low y-levels become stone. Y-level 64 is still about half air and half stone. However, since step 2 is skipped in the Nether, the terrain there remains relatively uniform from top to bottom. There's probably some extra code in there to make sure that the top and bottom few layers are solid stone/netherrack.
5. Finally, some of the stone is turned into ores, dirt, grass, caves, etc., just as it is now.
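To make steps 1, 1a, 2 and 4 a bit more concrete, here is a rough Python sketch of how I picture them fitting together. None of this is Minecraft's actual code: the names (`make_lattice`, `density`, `height_bias`, `is_stone`) are made up, plain random numbers stand in for real Perlin noise, and the spacing of 4 and the +1/-1 points at y=32 and y=96 are just my guesses from above.

```python
import random

CELL = 4        # assumed sample spacing ("multiples of 4 or 8, or something like that")
SEA_LEVEL = 64  # height at which the bias is 0
FALLOFF = 32    # assumed: bias is +1 at y=32 and -1 at y=96


def make_lattice(size_x, size_y, size_z, seed=0):
    """Coarse grid of densities in [0, 1]; random values stand in for Perlin noise."""
    rng = random.Random(seed)
    nx, ny, nz = (s // CELL + 2 for s in (size_x, size_y, size_z))
    return [[[rng.random() for _ in range(nz)] for _ in range(ny)] for _ in range(nx)]


def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def density(lattice, x, y, z):
    """Step 1a: trilinear interpolation of the coarse samples at block (x, y, z)."""
    ix, rx = divmod(x, CELL)
    iy, ry = divmod(y, CELL)
    iz, rz = divmod(z, CELL)
    tx, ty, tz = rx / CELL, ry / CELL, rz / CELL
    c00 = lerp(lattice[ix][iy][iz], lattice[ix + 1][iy][iz], tx)
    c10 = lerp(lattice[ix][iy + 1][iz], lattice[ix + 1][iy + 1][iz], tx)
    c01 = lerp(lattice[ix][iy][iz + 1], lattice[ix + 1][iy][iz + 1], tx)
    c11 = lerp(lattice[ix][iy + 1][iz + 1], lattice[ix + 1][iy + 1][iz + 1], tx)
    return lerp(lerp(c00, c10, ty), lerp(c01, c11, ty), tz)


def height_bias(y):
    """Step 2: 0 at sea level, positive below it, negative above it."""
    return (SEA_LEVEL - y) / FALLOFF


def is_stone(lattice, x, y, z):
    """Step 4: threshold the biased density at 0.5."""
    return density(lattice, x, y, z) + height_bias(y) > 0.5


lattice = make_lattice(16, 128, 16)
column = "".join("#" if is_stone(lattice, 5, y, 5) else "." for y in range(128))
print(column)  # '#' is stone, '.' is air, read from y=0 up to y=127
```

With these numbers everything above y=80 is guaranteed to be air and everything below y=48 is guaranteed to be stone, and the interesting terrain happens in the band in between.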
An open-roofed caveworld could be as simple as finding the parameter in step 2 of the Overworld code, and modifying it so that it doesn't reach -1 until y-level 256.
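As a sketch of what that change might look like (again in Python, with made-up names and my guessed numbers, not anything from the real code): the bias below y-level 64 is left alone, and only the height where it reaches -1 is moved from 96 up to 256. Keeping the lower half unchanged is my own assumption.

```python
SEA_LEVEL = 64  # height at which the bias is 0
BOTTOM = 32     # guessed height at which the bias reaches +1


def height_bias(y, top=96):
    """Density offset for height y; `top` is where the offset reaches -1."""
    if y <= SEA_LEVEL:
        # Below sea level: unchanged, rises to +1 at BOTTOM.
        return (SEA_LEVEL - y) / (SEA_LEVEL - BOTTOM)
    # Above sea level: falls to -1 at `top`.
    return -(y - SEA_LEVEL) / (top - SEA_LEVEL)


def is_stone(raw_density, y, top=96):
    """Threshold the biased density at 0.5, as in step 4."""
    return raw_density + height_bias(y, top) > 0.5


# A fairly dense block (raw density 0.9) at y=100:
print(is_stone(0.9, 100))           # normal falloff: bias is -1.125, so air
print(is_stone(0.9, 100, top=256))  # stretched falloff: bias is -0.1875, so stone
```

With the stretched falloff, dense pockets of noise can survive as stone all the way up the world, thinning out gradually toward y-level 256 instead of being cut off below y-level 96.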