Omnimaga
General Discussion => Technology and Development => Computer Projects and Ideas => Topic started by: Ki1o on February 24, 2012, 04:49:33 pm
-
Many of you may not have noticed (or even cared about) my absence, but I have been busy for the last couple of weeks. I recently got accepted to a computer science research academy at my local college (I'm in high school), so that's what I have been doing. I am creating an interpreted programming language in C++ using either BYACC, YACC or ANTLR for the parser and LEX or FLEX for the lexical analyzer. If you have any ideas, suggestions, or tips, feel free to help out. Thanks. :D
-
Can we have a Hello world in this fabulous new language ;D ?
-
I'd like to see some basic syntax for this, and if I learn how to use the tools you talked about, I'd be glad to help. I'm kind of a noob in C++, so I might not be able to do much. I'm much better with Python, though I am learning C++.
-
I'm with ruler in saying I'd like to see some syntax, but it sounds cool! I'm actually looking into compiler design myself, but not for direct machine code- I'm trying to use the .NET Framework as a base, but I'm going to create a parser, lexer, etc. all from scratch. I'm decent with C++, so if you need any help or anything feel free to ask.
Good luck!
-
Wow, sounds very promising, creating a programming language with C++. I personally tried to learn C++, but didn't get much further than halfway through SAMS Teach Yourself C++, and I also looked at C++ for Dummies. That's pretty much it for me. I mainly used Codeblocks when typing C++, a great-looking GUI program that bundles the compiler and code editing together (you have to install the compiler separately, but some installers come bundled with mingw).
Good luck with your project!
-
Right now we are working on learning C++ a little more. We are also working on learning BYACC, ANTLR, and LEX/FLEX. Afterwards we can define the grammar and incorporate basic arithmetic as well as variables, loops, and control flow.
Right now we have named it M4Trix.
-
Seems interesting. Make sure, however, to start with the basics first, then work your way into the depths of the language, like with Axe Parser. That way you don't start too big and get overwhelmed by the size and complexity of such a project. An example of what I mean: download Axe Parser v0.0.1 and then a few later versions, see what was present at each point, and notice how the author, Quigibo, added more and more over time.
It's also good to not make the syntax too complicated but still quite readable and short.
-
Yeah, we have taken it in another direction, trying to see if we can create a hand-written parser. We also have a basic syntax defined.
Example :
<<< is input
>>> is output
We have basic arithmetic tokens as well as comments.
Example program would be:
>>> "Hello World!"
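To make the hand-written parser idea concrete, here is a minimal sketch of a recursive-descent evaluator for basic arithmetic. It is written in Java purely for illustration; the class name ArithParser, its methods, and the grammar in the comments are assumptions and not the project's actual code or syntax.

```java
// Hypothetical sketch: a hand-written recursive-descent evaluator
// for + - * / and parentheses. Not the project's real parser.
class ArithParser {
    private final String src;
    private int pos = 0;

    ArithParser(String src) { this.src = src; }

    // Parses and evaluates a whole expression, rejecting trailing input.
    static double eval(String text) {
        ArithParser p = new ArithParser(text);
        double value = p.expression();
        p.skipSpaces();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return value;
    }

    // expression := term (('+' | '-') term)*
    private double expression() {
        double value = term();
        while (true) {
            skipSpaces();
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    private double term() {
        double value = factor();
        while (true) {
            skipSpaces();
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    // factor := NUMBER | '(' expression ')' | '-' factor
    private double factor() {
        skipSpaces();
        if (accept('-')) return -factor();
        if (accept('(')) {
            double value = expression();
            skipSpaces();
            if (!accept(')')) {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            return value;
        }
        int start = pos;
        while (pos < src.length()
                && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at position " + pos);
        }
        return Double.parseDouble(src.substring(start, pos));
    }

    // Consumes the character c if it is next in the input.
    private boolean accept(char c) {
        if (pos < src.length() && src.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    public static void main(String[] args) {
        System.out.println(eval("1 + 2 * (3 - 1)"));
    }
}
```

Each grammar rule becomes one method, so operator precedence falls out of which method calls which: multiplication binds tighter than addition because term() is called from expression().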
-
Do you compile to some byte-code then interpret that, or do you interpret the source directly?
...I am creating an interpreted programming language in C++...
-
The source will be interpreted directly for difficulty purposes (Creating a virtual machine etc.). :P
EDIT: Meaning creating bytecode would be too hard.
-
I know what you meant :P lol. Do you store the information to some kind of buffer? I think that would be easier if you plan on doing loops, but then again I'm not the one taking the course lol.
-
The source will be interpreted directly for difficulty purposes (Creating a virtual machine etc.). :P
EDIT: Meaning creating bytecode would be too hard.
Creating bytecode is largely the same process as interpretation at the simplest level... If you want to apply optimizations to the bytecode, that's different and can be much more complicated, but just mapping language to bytecode should be relatively simple with a solid language/bytecode design. I've found it's a good sanity check for a language that it is easy to generate a bytecode based on code in that language. If it's not easy, then the language is too complex. Let's take a while loop as an example:
While(A>0){
B=B+1;
C=C+2;
};
Let's try parsing this to bytecode:
TEST_GREATER(A,0); // Tests if arg1 is greater than arg2 and returns 1 if it is, 0 otherwise.
IF_NOT_GOTO(ANS,+6); // If Not(ANS) is true, then GOTO the opcode 6 opcodes after this one (code after the while loop)
INC(B,1); // Increment B and store result in ANS
SETVAR(B,ANS); // Store the incremented value of B to the variable B, overwriting the old value
INC(C,2);
SETVAR(C,ANS);
GOTO(-6); // GOTO the opcode 6 opcodes before this one (The condition test)
As you can see, generating a bytecode is basically equivalent to making the interpreter understand the code at all. One reason why bytecodes are so often used is that it's a lot faster to parse the one or two bytes that make up a typical opcode than it is to parse a potentially multi-lined script over and over again, when the work really only needs to be done once.
If the above bytecode looks like Assembly, that's because that's exactly what you're generating! Assembly code for a virtual machine.
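As a rough illustration of how a virtual machine could execute bytecode like the listing above, here is a minimal Java sketch. The names TinyVm, Instr, and runLoop are hypothetical, and the instruction encoding is an assumption made for this example. One deliberate change: the loop body also decrements A so that the program terminates, which makes the jump offsets +8 and -8 instead of +6 and -6.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a VM for the opcodes in the post above.
class TinyVm {
    enum Op { TEST_GREATER, IF_NOT_GOTO, INC, SETVAR, GOTO }

    // One instruction: opcode, optional variable name, integer operand.
    static final class Instr {
        final Op op;
        final String var;
        final int arg;
        Instr(Op op, String var, int arg) { this.op = op; this.var = var; this.arg = arg; }
    }

    // Executes the program until the program counter runs off the end.
    static void run(Instr[] program, Map<String, Integer> vars) {
        int ans = 0; // the ANS register
        int pc = 0;  // index of the current instruction
        while (pc >= 0 && pc < program.length) {
            Instr in = program[pc];
            switch (in.op) {
                case TEST_GREATER:
                    ans = vars.getOrDefault(in.var, 0) > in.arg ? 1 : 0;
                    pc++;
                    break;
                case IF_NOT_GOTO:
                    pc += (ans == 0) ? in.arg : 1; // relative jump when ANS is false
                    break;
                case INC:
                    ans = vars.getOrDefault(in.var, 0) + in.arg;
                    pc++;
                    break;
                case SETVAR:
                    vars.put(in.var, ans);
                    pc++;
                    break;
                case GOTO:
                    pc += in.arg; // unconditional relative jump
                    break;
            }
        }
    }

    // Runs While(A>0){B=B+1; C=C+2; A=A-1;} and returns {A, B, C}.
    static int[] runLoop(int a) {
        Instr[] program = {
            new Instr(Op.TEST_GREATER, "A", 0),
            new Instr(Op.IF_NOT_GOTO, null, 8),
            new Instr(Op.INC, "B", 1),
            new Instr(Op.SETVAR, "B", 0),
            new Instr(Op.INC, "C", 2),
            new Instr(Op.SETVAR, "C", 0),
            new Instr(Op.INC, "A", -1), // added so the loop ends (assumption)
            new Instr(Op.SETVAR, "A", 0),
            new Instr(Op.GOTO, null, -8),
        };
        Map<String, Integer> vars = new HashMap<>();
        vars.put("A", a);
        vars.put("B", 0);
        vars.put("C", 0);
        run(program, vars);
        return new int[] { vars.get("A"), vars.get("B"), vars.get("C") };
    }

    public static void main(String[] args) {
        int[] r = runLoop(3);
        System.out.println("A=" + r[0] + " B=" + r[1] + " C=" + r[2]);
    }
}
```

The whole interpreter is one loop and one switch, and the loop in the source language has been reduced to two relative jumps.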
-
Sorta off topic: would that be for a stack-based VM or a register-based one? I've seen tutorials on generating bytecode for a VM, but we have a limited amount of time in which to get this done, which is why we favored just directly interpreting it.
-
That bytecode would be more suitable for a register based VM, which is what I highly recommend unless you have a specific reason for choosing stack based (like portability to very memory limited systems).
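For contrast, here is a minimal sketch of how the single statement B=B+1 might look on a stack-based machine, where operands are pushed and popped instead of being named in each instruction. The opcodes PUSH, LOAD, ADD, and STORE and the class StackVmSketch are hypothetical and not taken from any real instruction set.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a tiny stack machine.
class StackVmSketch {
    static final int PUSH = 0, LOAD = 1, ADD = 2, STORE = 3;

    // Runs code laid out as [opcode, operand] pairs against variable slots.
    static int[] run(int[] code, int[] slots) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int pc = 0; pc < code.length; pc += 2) {
            int operand = code[pc + 1];
            switch (code[pc]) {
                case PUSH:  stack.push(operand); break;                  // push a constant
                case LOAD:  stack.push(slots[operand]); break;           // push a variable
                case ADD:   stack.push(stack.pop() + stack.pop()); break; // operand unused
                case STORE: slots[operand] = stack.pop(); break;         // pop into a variable
                default: throw new IllegalStateException("Unknown opcode " + code[pc]);
            }
        }
        return slots;
    }

    public static void main(String[] args) {
        // B = B + 1, with B living in slot 0.
        int[] code = { LOAD, 0, PUSH, 1, ADD, 0, STORE, 0 };
        int[] slots = run(code, new int[] { 41 });
        System.out.println(slots[0]);
    }
}
```

The statement takes four stack instructions here versus two in the INC/SETVAR style above, but each instruction is simpler because ADD never has to say where its operands live.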
-
Ok, so after some discussion we've decided to change some of the syntax. Input and output will be marked with only one < and > respectively. We've also decided to use Java for the lexical analyser and the parser, which will translate the source code into Java bytecode.
The interpreter will be written in C++ for speed purposes. Essentially our language will be quite portable.
EDIT: Also we're thinking about changing the name. Any suggestions?
EDIT2: OK my bad.... the whole thing will be implemented in Java and will be compiled to Java bytecode ;D
That's correct now
-
If > and < are input and output, what would the greater and less than signs be? << and >>? I guess that would work.
Naming suggestions... Idk
Nice project though! :D
-
@cyanophycean314, no, it checks whether the symbol is at the beginning of a line. If so, it is read as input or output. If not, it is a comparison operation.
-
Hello, I am working with Ki1o on the programming language, and I am posting to clarify our intentions for it. We were originally going to implement the lexer, parser, and interpreter in C++. However, we decided we wanted the language to be more portable, so we decided to implement the lexer and parser in Java and the interpreter in C++ (for performance reasons). Now we are just going to implement the whole project in Java and compile directly to Java bytecode. We plan to create a dynamically typed and easy to use (and eventually object oriented) programming language. We hope to reduce the verbosity found in some other programming languages. Our reasoning for implementing the whole project in Java is that Java is easier to use (in my opinion) than C++, more portable, and more secure. We are still deciding on a name and would love any suggestions.
Just to show the simplicity of this language, Ki1o has already posted the methods for input/output.
< "Output"; // output
> variable; // input
< "You entered " + variable; // Output expression
@cyanophycean314 you raise a good point, and the way we will differentiate between these and the greater than and less than operators is by context. If > or < appears at the start of a line, it implies input/output. If it is used as an operator (2 < 3), then it will be treated as a greater than or less than operator.
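A minimal sketch of that context rule, written in Java for illustration. The class IoMarkerLexer, the Kind enum, and the classify method are hypothetical names, and the sketch assumes the caller already knows the index of a < or > that is not inside a string literal.

```java
// Hypothetical sketch: decide what a '<' or '>' means from its position in the line.
class IoMarkerLexer {
    enum Kind { OUTPUT, INPUT, LESS_THAN, GREATER_THAN }

    // Classifies the '<' or '>' found at index i of a single source line.
    static Kind classify(String line, int i) {
        char c = line.charAt(i);
        if (c != '<' && c != '>') {
            throw new IllegalArgumentException("No '<' or '>' at index " + i);
        }
        // Only whitespace before the symbol means it starts the statement.
        boolean atLineStart = line.substring(0, i).trim().isEmpty();
        if (atLineStart) {
            return c == '<' ? Kind.OUTPUT : Kind.INPUT;
        }
        return c == '<' ? Kind.LESS_THAN : Kind.GREATER_THAN;
    }

    public static void main(String[] args) {
        System.out.println(classify("< \"Output\";", 0));
        System.out.println(classify("> variable;", 0));
        System.out.println(classify("< 2 < 3;", 4));
    }
}
```

Because the decision depends only on what precedes the symbol, a line such as `< 2 < 3;` can use the same character both as the output marker and as a comparison.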
-
Ok, that works too. Once again, keep up the good work! :D
Edit: :ninja:
-
Sorry I didn't reply before, I didn't get an email for some reason.
As for names, I prefer colors for some reason lol. I have a sort of VM I'm writing called Red, and I was thinking about changing my .NET language to Blue. I also think things dealing with physics have cool names, like quasars and novas.
Now for bytecode, do you mean actual Java bytecode, or your own implementation of it?
-
Java bytecode.
-
So literally the same byte structure as a .class file?
-
@BlakPilar Yes, the bytecode structure of a .class file following the Oracle JVM spec: http://docs.oracle.com/javase/specs/
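For a sense of what that structure looks like at the byte level, here is a minimal Java sketch that checks the first eight bytes of a class file: the magic number 0xCAFEBABE followed by the minor and major version, all big-endian. The class name ClassHeader and the method majorVersion are hypothetical; a real compiler would go on to write the constant pool, fields, methods, and attributes described in the spec.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: validate the start of a .class file.
class ClassHeader {
    // Returns the major version, or throws if the header is not a class file header.
    static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("Too short to be a class file");
        }
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // ByteBuffer is big-endian by default
        int magic = buf.getInt();                     // u4 magic
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("Bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF;          // u2 minor_version (read and skipped)
        int major = buf.getShort() & 0xFFFF;          // u2 major_version
        return major;
    }

    public static void main(String[] args) {
        // A hand-built header: magic, minor version 0, major version 52.
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52
        };
        System.out.println(majorVersion(header));
    }
}
```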
-
So essentially programs written in your language will be able to be run on any computer with a Java VM on it. Very nice. Are you going to support classes, enums, methods, etc.?
-
Yes, it will be able to run on any computer with a JVM. We are still learning, so we are going to start with basic features such as input, output, variables, and arithmetic. We will later add support for functions, if/else if/else statements, loops, arrays, and eventually classes and objects.
-
Yes, right now we are studying the JVM specs so we can interpret and compile the instructions.