Show Posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.


Messages - harold

Pages: 1 ... 9 10 [11] 12 13 ... 16
151
Computer Projects and Ideas / Writing a compiler [tutorial]
« on: February 15, 2013, 07:27:00 pm »
Compiler tutorials often combine a overly-simple source language with a useless target. This one will combine a "sufficiently C-like language" with x64 (AMD64) as a target, so you can actually compile some interesting code with it and actually run it on your computer (no emulator needed).

How this tutorial works: first, general information on how to write a compiler like one of my compilers. Then, more about what I actually did in one of my compilers.

Some knowledge of x64 assembly may come in handy.

The Architecture

To keep things simple, the code-gen will be almost directly after the parser, with no significant optimizations or even type checking. Maybe I'll get into those later.

The picture of the architecture then becomes this:
Code: [Select]
input source code -> [parser] -> AST -> [semantic analysis] -> annotated AST -> [code-gen] -> blocks of code -> [linker] -> .exe fileAST means Abstract Syntax Tree.

The Parser

Compiler courses at universities tend to focus on parsing a lot, because it has a lot of "interesting" theory to talk about. In real life though, you don't build your own parse-tables, you use a parser generator. Parsing is, essentially, a solved problem. I use Coco/R a lot, because I like to work in C# and it's one of the few good parser generators that come with a plug-in for Visual Studio. In any case, it doesn't matter how you decide to parse really, the end result will probably an Abstract Syntax Tree.

Abstract Syntax Trees

The Abstract Syntax Tree you will get out of the parser is tree structure of the input source code. It contains all the relevant information from the source in a format more suitable for a computer.

Semantic Analysis

In serious compilers, you'd do a lot more here; such as type checking and checking for other errors (Definite Assignment, for example). But to get a functioning compiler, all you need is to bind name-uses to their definitions (strictly speaking you don't even need this, but having it allows your language to make sense). This is where things finally get non-trivial enough to warrant an explanation.

Unless you used "The Parser Hack" (mixing the Symbol Table up with the parser), names in the AST, such as variable names and function names, are textual. But in the rest of the compiler, you'll need to know what the actual thing they refer to is, not just the name. That's easy enough to fix, though.

Walk through the AST, and for every "scope", you make a new symbol table and push it on a stack.
For everything that declares anything, put that thing in the top-most symbol table.
For every use of a name, search for that name in the stack of symbol tables. Use the definition closest to the top of the stack.

Practical considerations:
- the stack can be The Stack, by doing it recursively.
- though slightly "dirty", using a writable field in the AST nodes that represent variable uses and function calls to store a reference to the AST node that declared them, works great.
- using a map<string, Node> (or equivalent) for the symbol tables makes sense, and I see no reason not to do it.
- function arguments should have their declaration be active inside the scope of their functions, not in the global scope, unless you really want your language to not make sense.

Code-gen

This is where the trouble starts. Code-gen is pretty hard to get through, and this is, I think, where most compiler projects get stuck. Number 1 reason: trying to generate fast code. Everyone likes fast code, but you'll get into lots of trouble - Register Allocation is just about doable (though nontrivial), but you'd need to match tree-patterns with a huge database of patterns to get some nice code. I like fast code too, though, so this is exactly what I'll be doing in my next compiler, and maybe I'll write a short tutorial for that when I get it working.

But for now, there will be slow code. Stack machine code, in fact. But AMD64 isn't a stack machine! And that's really OK, that just means there would be explicit pushes and pops. But on 64bit Windows, you shouldn't do that. It's not like you can't, but you shouldn't, and I'll get into why when I talk about the Linker. But fortunately, you'll only ever use a statically determinable amount of stack, at most the maximum depth of an expression. At most, but it can be less - using the stack for everything would be so wasteful that it would make me cry, so I keep the top of the stack in RAX. The offset on the stack where a temporary value should go is easily determined by keeping track of what the stack-depth would have been if you had used pushes and pops.

Generating code for a binary (2-operand) expression is easy then: generate the code for the Right Hand Side, save the value in the stack at the right offset, generate code fore the Left Hand Side, then use the appropriate "op RAX, [RSP + offset]" instruction.
That last part should make clear why the Right Hand Side comes first: the memory operand has to be the Left Hand Side (otherwise the result would go back into memory, instead of into RAX).
An optimization is immediately obvious: generate "op RAX, constant" if the Left Hand Side is a constant (of the RHS and the operation is commutative). Feel free do it - I did.

Generating code for a variable may require some trickery, if you have array-types that should "decay" into pointers for example, using a variable of that type should give its address instead of its value.
For your usual local variables, you'd generate a "mov RAX, [RBP + offset]". But what should the offset be? Well, there an extra walk over the AST comes in, one that sums all the sizes of the local variables and remembers what the total size was when a variable was declared, that size will be its offset (you can use smarter layouts there, but this works). In this walk over the AST, you should also keep track of the maximum number of arguments to functions that you call, because you'll need it.

Generating code for a constant is trivial: "mov RAX, constant"

Generating code for assignments. Suddenly the Left Hand Side is not a value, but a location. It's possible to use a special LValue code-gen for that, but I just special-cased everything right there. In the language I compile there isn't much to choose from anyway: local scalar variables (including function arguments), local array dereference or pointer dereference. (no globals, though they wouldn't be hard to add)

Generating code for "return" is interesting. Code should be generated for its argument (if it has one), but then what? How do you return? Not "ret", or you'd skip the epilogue of the function. A jump would need to know the offset to the end.. but that isn't known yet. So a "sort of like half-a-stage" at the very end of the code-gen takes care of that.
Practical: you need to know the label of the end of this function. For this reason and others, you need to know the current function, so keep track of it somehow.

Generating code for "if" and "while". Again labels are needed, and the Linker will deal with them. The new and interesting bit here is conditions. It's possible to use the normal expression code-gen and then do a "test rax, rax", and always use the Z flag. But the code generated that way sucks too much for me, so I wrote a special code-gen for conditions. Tip: only the root node of the AST of the condition is really special, and you can have a rule that it must be a comparison. That means the special condition code-gen only has one method in it.

Generating code for functions. The body is trivial when the have the rest done, it's the prologue and epilogue that are the problem. You should have the "maximum stack usage by expressions" somewhere, and the "space used by locals", and the "maximum number of arguments to functions that you call" (multiply that by 8). Don't forget to align the stack by 16, otherwise Windows functions that you call will likely crash.
Also, to simplify the codegen for using variables and calling functions, the arguments to this function should be saved to their home locations on the stack. Optionally a frame pointer can be used, you don't really need it but I chose to use it because the generated code is a bit easier to understand and debug (bugs in the code-gen will happen, so that's not unimportant).
"return" needed a label to just before the epilogue, so actually put that label there.

Generating code for strings: "lea RAX, [rip+somelabel]". Tell the linker about the string and this usage, though. It has to put it in your data segment and patch up the label reference.

Generating code for imports: you don't. They just have a label. You don't even have to tell the Linker about this, if it is ever called then the call can do that.

Generating code for function calls: mostly trivial, except when calling imports. They have to be called as "call [rip+label]", and you have to tell the Linker about it. Names were bound, so at this point you can distinguish internal calls for calls to imports, and you know the details of the import, such as which dll it was from (because you have its declaration node), so you can do that actual importing in the call node. This means unused imports won't be imported.

You can make up the rest (unary operators, the special handling of division and modulo, for loops, etc), it's just more of the same.

Blocks of Code and the "sort of like half-a-stage"

At this point, you really don't want to be stuck with textual assembly code, you'd only have to parse it again. I just made a giant list of functions like "add_q_r_r(R r, Rrm)" that output the right bytes into a buffer, and then collect a bunch of buffers. Every "block" has a buffer, an optional label, and an optional "target label".

The "half stage" then walks over the blocks, and does a couple of things:
- it tries to fix up jumps. But what if they're not within range? Well if you used the 32bit-offset versions, then you are now in trouble because the offset is just too big to make sense. If you used 8bit-offsets, then you may now have to go back and change the instruction and the length of the block. While resizes keep happening, keep iterating this procedure (a resize may cause a jump target to fall out of range)
- when all the sizes are set in stone, fix up the actual offsets of jumps and calls to internal functions.

Linker

The separate compilation model is for 99% a relic of the past. So the "Linker", or what's left of it, lost the task of gathering stuff from objects files. In fact, there are no object files.
What it does, then, is:
- create import and exception directories
- patch up import references (and data references, if you have, say, string literals)
- lay out the program code and data etc in virtual memory
- write it all into a valid .exe file

Generating the import directory is easy when you know the structure of it. The exception directory is trickier though. It has to know, for every function, where it begun and where it ended, and how much stack it allocated, and which (if any) register it used as frame pointer, and what offset it had relative to RSP, and most annoyingly of all, the exact sequence of events in the prologue.
You need a lot of information that only the function that handles the AST node for functions has, so that's where I put it into an object and give it to the Linker. It just puts it in a list until its stage runs.

Patching up import references. To do this, you need the big list of places where imports were called, and the import directory you just made. Simply write the offsets into the code.

Laying stuff out in virtual memory is simple, just put everything in their segments, and align them. By the segment alignment that is, not by the file alignment.

Writing it out into a valid .exe seems hard but isn't really, just study the structure here. In theory there's a huge number of fields that you could fill in, but most of them aren't necessary to make the program run. Remember to align sections that you're writing by the file alignment, though, which is typically lower than the segment alignment (in other words, in virtual space you have more padding than there will be in the file).

More

I will probably expand this part quite a few times. There's a lot I could write, but most of it seems irrelevant to me right now. Be sure to ask things - I've found that I explain things better when asked about them, compared to when I'm just monologuing.

One thing I did not mention above is that I have types. Kind of. The types are relevant, for example, in arrays and pointer-dereference. Strings pretend to be pointers to char. Pointers use 64bit math.

Also, I have an "op RAX, [RBP + offset]" optimization, and something similar for array accesses, and I reverse commutative ops when it helps, to save quite a lot of code that I'd have to wade through when debugging.

Constant folding and using trivial identities (eg x*0=0). The way I do this, is by giving every expression-node a virtual method Optimize that by default returns "this", but many sub-types override it with a method that return a constant if the result of that node is (locally - ie it does not track whether a variable is a known constant) known to be a constant.

152
ASM / Re: Opcodes that should have been incorporated into the Z80 processor
« on: February 08, 2013, 07:39:29 am »
  • ldmsr reg16 11101101 11[reg16]0000 put reg16 into the model specific register at the index specified by register A.
  • stmsr reg16 11101101 11[reg16]1000 stores the contents of the msr indexed by A in reg16.
  • cpuid 11101101 01110111 puts cpuid data indexed by register A in A, and sets the carry if A <= max cpuid index (otherwise leave the carry and A unchanged)
These encodings made sense to me, but maybe there are better ones.

As for the MSRs,
  • 0 settings bits 1 - 15 are reserved and must be zero. bit 0 indicated whether the interrupt stack MSR is used.
  • 1 interrupt stack if settings[0] is 1: when an interrupt occurs, SP is stored in user stack and SP is then loaded with interrupt stack before pushing PC.
  • 2 user stack receives the value of SP when an interrupt occurs and settings[0] is 1.

If interrupt stack is used, RETI and RETN are changed to do their thing with the interrupt flip flops, pop the return address off the stack, then load userstack into SP, then jump to the return address.
This thing is meant for making interrupts not screw with the stack the running program was using, enabling some optimizations that previously required interrupts to be disabled to leave them running.

I'm pretty sure MSRs could be used to create a place for MMU settings to be, so an MMU can be included in the z80 proper (not having to go through ports), but I'm too lazy to think of a nice way to do it.

As for cpuid, it conditionally sets the carry so you can check whether it's supported like this:
Code: [Select]
xor a
cpuid
jnc _not_supported
And to make things easy, the cpuid data indexed by zero is max cpuid index, indicating the maximum value you can ask cpuid for.

If cpuid is not supported, it corresponds to an undocumented NOP, which won't set the carry.

There should probably be a bit in the cpuid data indicating support for MSRs. Say, bit 0 of cpuid[A = 1].

153
2*3*3*37+0+0*3

33+3*3+0*2*7*0

Probably not the way you found because it's kind of .. constructed

154
Or to put it differently: stop friending everyone you see, it could be dangerous.

155
Computer Usage and Setup Help / Re: OpenGL framebuffer_object
« on: January 07, 2013, 03:26:22 pm »
Ok then I don't know, sorry. Maybe there really isn't a solution, if you already have the newest driver..

156
Computer Usage and Setup Help / Re: OpenGL framebuffer_object
« on: January 07, 2013, 06:52:13 am »
Ok this is just not right. Intel HD Graphics are supposed to support FBOs. Does this thingy help?

157
Computer Usage and Setup Help / Re: OpenGL framebuffer_object
« on: January 06, 2013, 12:30:32 pm »
Try this thing: OpenGL extension viewer
Use it to see if your renderer supports GL_ARB_framebuffer_object.

If it doesn't, well, wtf? Essentially everything supports it.. If it does, that game has a bug.

158
Art / Re: Extent of Copyright Laws with sprites
« on: December 23, 2012, 01:57:35 pm »
It's probably derivative work because it has been significantly transformed.

159
Computer Programming / Re: C# saving
« on: December 12, 2012, 01:32:39 pm »
Text files are hard. Not because they're hard per se, but using a text file means you'll need a to round-trip your data through a stringyfier and a parser and still be the same.

Binary files are much easier, you can use a BinaryWriter to just Write every field in every object (you may need to or want to add IDs to some objects so you can save references), use a BinaryReader and ReadSomeType to load (in the same way and order as you did the writing, of course), and your data is guaranteed to be intact. There will be no annoying things such as "loading a saved double loses accuracy", you won't have to decide on a syntax, and you won't have to write a parser.
Using automatic binary serialization has some of the same pros, but it has big cons: terribly slow, won't work across multiple versions, generates huge files, doesn't work with some build-in types anyway.

160
Math and Science / Pressure Checkerboarding
« on: November 23, 2012, 06:13:26 am »
I wrote a simple fluid simulator with a co-located grid for pressure (it seemed the easiest way to do it), and the thing I'm simulating can't use artificial velocity dampening (I'm simulating objects that move through air, it would be weird and improper if the air speed dropped towards one end of the screen). So I got pressure checkerboarding.

I "solved" that by sampling at 1.2 pixel offsets in all four directions (instead of the usual 1.0) in the jacobi iteration. That works because it samples into the pressure squares (of the checkerboard) and immediately smears them out, so they average out with their neighbours for the most part.

But what did I actually do in terms of physics? Does this violate any laws?

Also, slightly off-topic, there are some really bad effects caused by the fact that pixels are square - everything that isn't an axis-aligned line looks like a "stair" and it messes with the pressure. Is there some way I can easily simulate a smooth surface?

161
Introduce Yourself! / Re: Not really new, but
« on: November 22, 2012, 04:17:42 am »
Thanks for all the peanuts :)

162
Introduce Yourself! / Not really new, but
« on: November 21, 2012, 04:59:01 am »
I've been around, of course, but I'm planning to suddenly get way more active.

Why, you may ask? Because other places are getting increasingly unpleasant.

A common pattern is asking a question about performance only to get a dozen replies saying that performance is irrelevant and you shouldn't care about and "we're not going to answer you until you prove you actually need performance". Pretty much guilty until proven innocent.

Also, I never received that bag of peanuts, so this topic is totally necessary :)

163
Other / Re: Realistic Mecha Tech
« on: October 16, 2012, 05:25:57 am »
XOS looks cool yes. They use hydraulics, the only other realistic option (as far as I know). And they're having trouble making it self-powered, as I predicted, though going for the exoskeleton configuration makes that harder than it would be in a giant mecha.

edit: and I just thought of something. The large inner diameter of torque motors makes it possible to fit a simple robust planetary gear in there. For example, fix the sun gear to the upper leg, fix the carrier to the lower leg, drive the annulus with the torque motor. This simple arrangement can give you a 2:1 reduction (at best) in speed, and therefore a 2x increase in torque. What if we used a smaller torque motor with gears? Suppose we use three of them in a knee, same leg size, and go for a worse case angle again. Some math later, that gives (6 * 2700 newton meter) / ((9.81 m/s^2) * 1.2m) = 1.38 metric tons of holding weight, and at max torque it can "jump" with an acceleration very close to 1g. Is that good enough? I don't know. Let's put in the bigger model again, the 79.5cm one: if you use three, you get a holding weight of 3.24 metric tons. Now, I don't really know what the weight of this mecha will be, but that seems about enough - with plenty to spare for "carrying big things" even at that worst case angle. Or, put differently, if you put that specific knee (3 times the 79.5cm torque motor with a 2:1 reducing gears) in a 1ton mecha, you could make a jump of more than 5.7g. That's some jump. This isn't some slow lumbering giant, it's actually one of those crazy agile unrealistic mechas that you see in anime. Assuming, of course, that you can keep the weight at 1ton.

edit2: I made a quick rendering rendering that took a ton of work (it is in fact exactly to scale) of what a 16:9 (about as close as you can get to 2:1 with sturdy enough gears) gear system would look like

Now imagine a torque motor on the outside of that, the three small gears being connected to the inner framing of the lower leg, and the middle gear being connected to the inner framing of the upper leg. The large hole in the middle gear allows you to fit cables through an area where the cables only have to move minimally when the joint is turned. So far this seems to be working out well.

164
Other / Re: Realistic Mecha Tech
« on: October 15, 2012, 06:26:28 am »
Balance could be an issue yes.. I don't think it's really necessary to make the torso bendable in a complex way though, you don't really do that, you only bend your lower back (and I'm not sure you're even supposed to). A suitably complex hip should be fine, I think.

165
Other / Realistic Mecha Tech
« on: October 14, 2012, 04:52:06 pm »
With robotics being what it is today, why aren't there giant mecha's yet? So I started thinking, can we build them right now, with current tech?
And I haven't really encountered that much trouble, but then again, what do I know.

So here's a rough outline of my current idea, feel free to poke holes in it or, preferably, suggest something better:

Based on the following, I went for an "all-electric" approach:
- hydraulics: not too bad speed/strength, but angular range of motion of a joint is terrible. Acceptable precision.
- pneumatics: very fast, can have speed and precision trouble and the range of motion is still terrible.
- mechanical drive (connecting actuators to the main engine with pushrods or pulleys or whatever) is probably the worst option.
That leaves an option which doesn't seem to bad: electric torque motors everywhere. With a large radius, and active cooling (fluid-cooling for the high power parts such as knees, hip, elbow, shoulders, air cooling elsewhere). I did some calculating, and with current tech you can easily make your mecha strong enough with torque motors. I know, it's weird. But let's see:
Let's take one from this list, just to prove the point: http://www.etel.ch/torque_motors/TMB
A diameter of 80cm is acceptable for a 5 meter tall mecha (that sticks out only 40cm at each side of the joint, and that doesn't look as bad as you may think it does), so let's take the strongest one of those: 6.4KNm continuous, 11.2KNm peak. Let's put in the most demanding place, the knee, and assume a worst case angle of 0 degrees, and assume both leg sections have a length of 1.2 meter. That means your mecha can weigh about 550kg. WTF? Not good enough! No, but no one said you had to use just one. While it has a diameter of 80cm, it has a width of only 15cm. You can easily use three (or heck, why not four), giving you a max weight of 1600kg. That's more like it, and remember: you don't do this. This worst case corresponds to squatting on one leg. Why would you do that? You actually have two legs, and you're not even going to squat to zero degrees when preparing for a jump - and when you're jumping, you can turn the dials to eleven and use peak-torque, which is nearly twice the continuous max torque. Unless my physics suck (not entirely out of the question), this will work out just fine in terms of raw strength. You should be able to jump, run, climb and throw one hell of a punch.

Then there's the issue of power supply. Carrying around a generator and a flammable liquid works alright and is the usual solution, but if you look into Aluminium-Air batteries there is something to talk about there too. Alu-Air batteries have a lower energy density (9MJ/kg according to some sources) than, say, diesel fuel (45MJ/kg), but that's not the only factor. For Alu-Air, you don't have an engine and a generator, and that space is not insignificant - you'd need a pretty sturdy engine and generator. Also, being a battery, mechanical failure is unlikely unless moderately damaged, and it works great under non-constant load, which is what you'll have in a mecha - running, jumping, punching, etc? Such great bursts would require energy storage to smooth out if you went for a combustion motor + generator (unless you used pneumatics, which I chose not to).

Slightly into future-tech: both battery tech and electric motor tech are improving as we speak. These are active area's of research. If it works well enough now, imagine how well it will work a decade from now, or even just with the latest research models.

Now, the controls. They have been subject of much debate. Those "hands and feet only" style controls (joysticks and pedals, seriously?) don't offer nearly enough degrees of freedom - the entire idea is to have a tactical unit that can do whatever the heck, not just a "tank with legs". I think the best option here is keeping most of the body immobile and sensing the muscles of the pilot. That also lets you give the mecha more flexible joints than the pilot has - it senses which way you want to move a joint at what strength, not where you want to have it, so it could easily "go further" than a human joint. That's also no problem at all for a torque motor. It shouldn't be too hard to train for that. Taking such direct control also means you don't need very complex "walking algorithms" and systems to keep balance - the pilot should be able to do it himself, using his natural instincts, feeling for balance, and some training.
We're not quire there yet with reading input directly from the brain, but we're eerily close. That's the sort of tech no one knows exists - it does, but it hasn't, to my knowledge, been miniaturized yet, and the response times are bad.

That works great for legs and arms, but not so much for hands. There, you'll want tactile feedback, and that sort of tech already exists as well. Also, you'll need some way to control "other stuff" (turn night vision on and off, climate control, weapon systems, etc). So I propose that your left hand (or right, if you're a leftie) doesn't control a robotic hand but instead manipulates a touchpad. That makes the left arm of the mecha a good place for fixed weapons (no lasers yet, but a nice gatling gun should work), which may seem odd at first, but you'll get used to it, and you could still hold a weapon in your right hand. I'm not 100% satisfied with this approach though, so if you have a better idea, that would be great.

Sensors (vision etc): you'll want this to be good. I'm thinking stereoscopic projection on the eyes, with extra info overlaid (also the virtual buttons and stuff that your left hand manipulates, if you can't see them you can't find them), looks around when you turn your head. And you know the drill: can zoom well, can draw boxes around targets, has infrared mode, and generally tries to provide useful information without getting in the way. That already exists, too.

So, thoughts? Is this even on the right track?

Pages: 1 ... 9 10 [11] 12 13 ... 16