Drawing triangle (or in general convex polygon, but as I said we will use only triangles) is very simple. The basic idea of the line triangle drawing algorithm is as follows.
For each scanline (horizontal line on the screen), find the points of intersection
with the edges of the triangle. Then, draw a horizontal line between intersections.
Do this for all scanlines and you are done.
But how can we find these points quickly?? Using linear interpolation!
We have 3 vertices and we want to find coordinates of all points belonging to segments determined
by these vertices.
Assume we have segment given by points: (xa,ya) and (xb,yb). Our task is to find points:
(xc,ya+1), (xd,ya+2), ... , (xm,yb1), (xn,yb).
Notice that xa changes to xb in (ybya) steps.
We also have: xa=xa+0*(xbxa)/(ybya),
xb=xa+(ybya)*(xbxa)/(ybya) and, in general,
xi=xa+(yiya)*delta, where delta=(xbxa)/(ybya).
The general function for linear interpolation is:
f(X) = A + X*((BA)/steps)
where we slide from A to B in steps steps
Here is pseudocode for a triangle filling algorithm.

*** begin triangle filler *** if (B.yA.y > 0) dx1=(B.xA.x)/(B.yA.y) else dx1=0; if (C.yA.y > 0) dx2=(C.xA.x)/(C.yA.y) else dx2=0; if (C.yB.y > 0) dx3=(C.xB.x)/(C.yB.y) else dx3=0; S=E=A; if(dx1 > dx2) { for(;S.y<=B.y;S.y++,E.y++,S.x+=dx2,E.x+=dx1) horizline(S.x,E.x,S.y,color); E=B; for(;S.y<=C.y;S.y++,E.y++,S.x+=dx2,E.x+=dx3) horizline(S.x,E.x,S.y,color); } else { for(;S.y<=B.y;S.y++,E.y++,S.x+=dx1,E.x+=dx2) horizline(S.x,E.x,S.y,color); S=B; for(;S.y<=C.y;S.y++,E.y++,S.x+=dx3,E.x+=dx2) horizline(S.x,E.x,S.y,color); } *** end triangle filler *** 
I ought to explain what is the comparision dx1 > dx2 for. It's optimization trick: in the horizline routine, we don't need to compare the x's (S.x is always less than or equal to E.x).
The idea of gouraud and flat triangle is nearly the same. Gouraud takes only three parameters more (the color value of
each of the vertices), and the routine just interpolates among them drawing a beautiful, shaded triangle.
You can use 256colors mode, in which vertices' colors are simply indices to palette or hicolor mode (recommended).
Flat triangle interpolated only one value (x in connection with y), 256 colors gouraud needs three (x related to y, color related to y, and color related to x), hicolor gouraud needs seven (x related to y, red,green,blue component of color related to y, and color related to x (also three components))
Drawing a gouraud triangle, we add only two parts to the flat triangle routine. The horizline routine gets a bit more complicated due to the interpolation of the color value related to x but the main routine itself remains nearly the same.
I'll give you a full gouraud routine because good pseudocode is better than the best description :)
*** begin gouraud triangle filler *** if (B.yA.y > 0) { dx1=(B.xA.x)/(B.yA.y); dr1=(B.rA.r)/(B.yA.y); dg1=(B.gA.g)/(B.yA.y); db1=(B.bA.b)/(B.yA.y); } else dx1=dr1=dg1=db1=0; if (C.yA.y > 0) { dx2=(C.xA.x)/(C.yA.y); dr2=(C.rA.r)/(C.yA.y); dg2=(C.gA.g)/(C.yA.y); db2=(C.bA.b)/(C.yA.y); } else dx2=dr2=dg2=db2=0; if (C.yB.y > 0) { dx3=(C.xB.x)/(C.yB.y); dr3=(C.rB.r)/(C.yB.y); dg3=(C.gB.g)/(C.yB.y); db3=(C.bB.b)/(C.yB.y); } else dx3=dr3=dg3=db3=0; S=E=A; if(dx1 > dx2) { for(;S.y<=B.y;S.y++,E.y++) { if(E.xS.x > 0) { dr=(E.rS.r)/(E.xS.x); dg=(E.gS.g)/(E.xS.x); db=(E.bS.b)/(E.xS.x); } else dr=dg=db=0; P=S; for(;P.x < E.x;P.x++) { putpixel(P); P.r+=dr; P.g+=dg; P.b+=db; } S.x+=dx2; S.r+=dr2; S.g+=dg2; S.b+=db2; E.x+=dx1; E.r+=dr1; E.g+=dg1; E.b+=db1; } E=B; for(;S.y<=C.y;S.y++,E.y++) { if(E.xS.x > 0) { dr=(E.rS.r)/(E.xS.x); dg=(E.gS.g)/(E.xS.x); db=(E.bS.b)/(E.xS.x); } else dr=dg=db=0; P=S; for(;P.x < E.x;P.x++) { putpixel(P); P.r+=dr; P.g+=dg; P.b+=db; } S.x+=dx2; S.r+=dr2; S.g+=dg2; S.b+=db2; E.x+=dx3; E.r+=dr3; E.g+=dg3; E.b+=db3; } } else { for(;S.y<=B.y;S.y++,E.y++) { if(E.xS.x > 0) { dr=(E.rS.r)/(E.xS.x); dg=(E.gS.g)/(E.xS.x); db=(E.bS.b)/(E.xS.x); } else dr=dg=db=0; P=S; for(;P.x < E.x;P.x++) { putpixel(P); P.r+=dr; P.g+=dg; P.b+=db; } S.x+=dx1; S.r+=dr1; S.g+=dg1; S.b+=db1; E.x+=dx2; E.r+=dr2; E.g+=dg2; E.b+=db2; } S=B; for(;S.y<=C.y;S.y++,E.y++) { if(E.xS.x > 0) { dr=(E.rS.r)/(E.xS.x); dg=(E.gS.g)/(E.xS.x); db=(E.bS.b)/(E.xS.x); } else dr=dg=db=0; P=S; for(;P.x < E.x;P.x++) { putpixel(P); P.r+=dr; P.g+=dg; P.b+=db; } S.x+=dx3; S.r+=dr3; S.g+=dg3; S.b+=db3; E.x+=dx2; E.r+=dr2; E.g+=dg2; E.b+=db2; } } *** end gouraud triangle filler *** 
I hope you are familiar with idea of interpolation now, because I'm not going to give you another pseudocodes. You should do something by yourself, too, don't you think? :)
I'll show you the idea of linear (or 'classical') texture mapping (without perspective correction). Linear mapping works pretty well (read: fast) in some scenes, but perspective correction is in some way needed in most 3D systems.
Anyway: Again we're using the idea of interpolation: now we'll code a texture triangle filler. And again the idea is perfectly the same, only two more values to interpolate, that is five values total. In texture mapping, we interpolate x, u, and v related to y, and u and v related to x (u and v are coordinates in the 2D bitmap space). The situation is maybe easier to understand by looking at the following picture:
The left triangle is the triangle which is drawn onto the screen. There's a single scanline (one call to the horizline routine) pointed out as an example. The triangle on the right is the same triangle in the bitmap space, and there's the same scanline drawn from another point of view into it, too. So we need just to interpolate, interpolate, and once more interpolate in a texture filler  an easy job if you've understood the idea of a gouraud filler.
An optimization trick: the color deltas in gouraud and (u,v) coordinate deltas in texture remain constant, so we need to
calculate them only once per polygon. Let's take the u delta in linear texturing as an example. Assume, that
dx2<=dx3 (we are using the same symbols like in flat and gouraud filler).
As we know, we need to interpolate S.u to E.u in the horizline routine in (S.xE.x) steps.
We are in the need of a u delta (du) which would be the same for the whole polygon. So instead of calculating in
each scanline this:
du = (E.uS.u) / (E.xS.x),
we do like this in the setup part of the polygon routine:
We know that
S.x = A.x + (B.yA.y)*dx1,
S.u = A.u+(B.yA.y)*du1,
E.x = B.x = A.x+(B.yA.y)*dx2,
E.u = B.u = A.u+(B.yA.y)*du2,
WHEN y=B.y (when y is the ycoordinate of the second vertex).
When we place the values of the variables S.u,E.u,S.x and E.x (above) to the u delta statement,
du = (E.uS.u) / (E.xS.x),
we get the following statement as a result:
[A.u+(B.yA.y)*du2]  [A.u+(B.yA.y)*du1] du =  [A.x+(B.yA.y)*dx2]  [A.x+(B.yA.y)*dx1] (B.yA.y)*(A.uA.u+du2du1) <=> du =  (B.yA.y)*(A.xA.x+dx2dx1) du2du1 <=> du =  dx2dx1 outerUdelta2outerUdelta1 <=> innerUdelta =  outerXdelta2outerXdelta1
Nice! But what if dx2 = dx1? This of course means that the polygon is just one line, so du doesn't need any specific value; zero does the job very well.
Note! I find it hard to get good results using fixed point math because of inadequate precision (if you don't know what fixed point math is, forget about this note :)
As I said in 'shading' part, the way demos do environment mapping is very simple. Take the X and Y components of your pseudonormal vectors (perpendicular
to verices), and use them to index your texture map!. Very simple indeed. Your formulae would be:
U = N.x*128 + 127
V = N.y*128 + 127 (assuming 256x256 texture maps).
Or in general
U = N.x * (width / 2) + (width / 2)  1
V = N.y * (height / 2) + (height / 2)  1
Using texturing and shading at the same time is quite straightforward to implement: the basic idea being that we just interpolate the values of both texture and shade and blend them in a suitable ratio (alphablending).