First of all, calculating the winding number on a per-pixel basis is frotzing slow, and it requires pretty intense floating-point math (cross-products and such). This does not lend itself very well to being deferred to raster-scan time as you are (apparently) proposing.
Second, although doing edgelists is slow, tessellating a complex polygon down into basic triangles is somewhat quicker. It's still O(n^2), but you only have to do it once.
Third, using winding numbers gives you absolutely no way to do simple things such as:

- Gouraud shading
- Anything else which requires interpolation across the surface of the polygon

Fourth, there are many other interesting mechanisms to consider. For example, raycasting against the intersections of infinite planes (which is how the PowerVR chipset, used in the Dreamcast and Neon250 among others, does all its rendering). This mechanism also eliminates useless overdraw (since each pixel is raycast only until it hits a non-transparent surface), and leads to useful side effects such as "infinitely" precise depth buffering (in fact, no z-buffer is required), implicit infinite-precision stencil buffers, and other coolness. You really should concentrate on that instead. In addition, the raycasting can easily happen at scan time, eliminating the need for a backbuffer.
Fifth, there are other mechanisms by which the blitting can be deferred until the last possible moment which are much more efficient, such as span-buffering (which is used in Unreal Tournament), and which would be a great candidate for hardware implementation.
Sixth, why would complex polygons help anything out anyway? Any complex polygon can be tessellated down into convex polygons, which can be handled very efficiently by an O(ysize)-time edge scan, and you can't define an entire 3D scene by a single complex polygon, at least not without a LOT of preprocessing (which would be a lot slower than just software rendering - hell, a lot slower than raytracing).
Then again, judging by your multiply-linked-to diary rant, you think that hardware acceleration is the tool of the devil, and that any sort of abstract API is a bad, unartistic thing. I can't argue with illogic like that.
As others have pointed out, the only real bottleneck in 3D rendering right now is the bus between the CPU and the video card. nVidia's "vertex shaders" (a horrible term, IMO) do a lot to help with that, as do tried-and-true techniques such as compiled display lists and other mechanisms that things like OpenGL standardize in an abstract way.
[Editorial note: I can't believe this article's about to be posted.]
"Is not a quine" is not a quine.
I have a master's degree in science!
[ Hug Your Trikuare ]