Author Topic: Antelope (formerly "OPIA") - A Polymorphic z80 language (Read 46845 times)

shkaboinka · « **Reply #90 on:** March 25, 2012, 09:29:41 pm »

Quote from: NanoWar on March 25, 2012, 07:21:42 am

Multi arrays with [] seems like a must. Instead of tuples, can't you just return an array?

Multidimensional arrays are awesome

...and you could return an array if you really liked...

byte (x,y) = foo(1,2);
func foo(byte a, b): (byte,byte) {
   return (m,n);
}

[2]byte xy = bar([2]byte{1,2});
func bar([2]byte ab): [2]byte {
   return [2]byte{m,n};
}

[1]byte z = boo([1]byte{1});
func boo([1]byte c): [1]byte {
   return [1]byte{m};
}

...but I think tuples are a worthwhile feature

(and yes, they are loose enough to be equivalent to a list of function arguments).

Quote from: DJ_O on March 25, 2012, 12:32:50 pm

By the way, even if you make the language similar to standard languages, when this comes out, you should really start writing a good tutorial for it. Preferably make one for people with no coding experience, so that BASIC coders (the ones usually looking for alternatives) can easily pick up OPIA.

I try to design everything in the best way I can, regardless of other languages. When I use a feature that other languages have, I try to use the common form if I can, but sometimes I find "better" forms (the cofunc combines what other languages present as closures and generators/iterators). ... But I definitely plan on focusing on other details after the language and compiler are designed (documentation, tutorials, GUI tools, defining standard libraries). NOTE: These are the things that people can be VERY involved in later on!!

shkaboinka · « **Reply #91 on:** March 26, 2012, 09:52:31 am »

Ok, here are some bottom lines for arrays that I want to stick with (you can forget what I said so far if this makes it simpler):

(1) The first part of any array (e.g. the [X] part of [X][Y][Z]T) will be static if all of its dimensions are given; otherwise it will be a pointer to an array:

Code: [Select]

   [5][Y][Z]T // Static array of 5 [Y][Z]T values
   [ ][Y][Z]T // Pointer to array of (some number N of) [Y][Z]T values
 [5,5][Y][Z]T // Static array of 5x5 [Y][Z]T values
 [ ,5][Y][Z]T // Pointer to an array of Nx5 [Y][Z]T values
  *[5][Y][Z]T // Pointer to (because of *) array of 5 [Y][Z]T values
  *[ ][Y][Z]T // Pointer to pointer to array of N [Y][Z]T values

(2) Any pattern of [X][Y][Z] should create a jagged array and [X,Y,Z] should create a rectangular array. This is done by making all "inner" [...] values be pointers -- REGARDLESS of what dimensions are given:

Code: [Select]

   [5][ ]T // Array of 5 (pointer to array of T) values
   [5][5]T // Array of 5 (pointer to array of 5 T) values
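
To make the distinction in (2) concrete, here is a rough Python model (not OPIA code, and not how the compiler is implemented). It treats a rectangular array as one contiguous block addressed by a computed offset, and a jagged array as an outer array of references to separately stored rows. The names `RectArray`, `rect`, and `jagged` are made up for this illustration.

```python
class RectArray:
    """Model of a rectangular [rows, cols]T array: one flat allocation."""

    def __init__(self, rows, cols, fill=0):
        self.rows = rows
        self.cols = cols
        self.cells = [fill] * (rows * cols)  # single contiguous block

    def _offset(self, r, c):
        if not (0 <= r < self.rows and 0 <= c < self.cols):
            raise IndexError((r, c))
        return r * self.cols + c  # row-major offset into the block

    def get(self, r, c):
        return self.cells[self._offset(r, c)]

    def set(self, r, c, value):
        self.cells[self._offset(r, c)] = value


# Model of a jagged [3][]T array: the outer list holds references
# ("pointers") to rows that live elsewhere and may differ in length.
jagged = [[1], [2, 3], [4, 5, 6]]

rect = RectArray(2, 3)
rect.set(1, 2, 42)  # lands in the last cell of the flat block
```

The rectangular form needs no per-row pointers, which is why its inner dimensions have to be known; the jagged form pays for one reference per row but lets each row have its own length.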

(3) "Inner dimensions" (e.g. the x's in [,x]T or [,x,x]T) must be given explicitly (as numbers). I'd LIKE to allow types like [,]T, but it just is not feasible without either providing a clunky mechanism or adding assumptive overhead (which will fail to match native system structures). However, this CAN be done manually as in (4):

(4) Multidimensional arrays (not jagged arrays) can be treated as one dimensional arrays (since that is how they are actually stored). For completeness, I could allow other conversions as well (that is, if you can go from MxN to N, you should be able to go from N to MxN; thus if you can go from LxMxN to N to MxN, you might as well go from LxMxN to MxN, etc.):

Code: [Select]

   [L,M,N]T arr; // An LxMxN array of T values
   [ ,M,N]T p3 = arr; // Pointer to an ?xMxN array (3D)
   [   ,N]T p2 = arr; // Pointer to an ?xN array (2D)
   [     ]T p1 = arr; // Pointer to an (N) array (1D)
   p3[x,y,z] == p2[x*M+y, z] == p1[(x*M+y)*N+z] == arr[x,y,z]
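
The index identity on the last line can be checked mechanically. The following Python sketch (an illustration only, with arbitrarily chosen dimensions and made-up helper names `at3`, `at2`, `at1`) models the LxMxN array as one flat list and confirms that the 3D, 2D, and 1D views all reach the same element.

```python
from itertools import product

L, M, N = 2, 3, 4  # example dimensions, chosen arbitrarily

# Flat storage for an LxMxN array; each cell holds its own flat index.
flat = list(range(L * M * N))


def at3(x, y, z):
    """arr[x,y,z] or p3[x,y,z]: view the storage as ?xMxN."""
    return flat[(x * M + y) * N + z]


def at2(row, z):
    """p2[row,z]: view the same storage as ?xN."""
    return flat[row * N + z]


def at1(i):
    """p1[i]: view the same storage as a 1D array."""
    return flat[i]


# Compare all three views at every valid (x, y, z).
all_agree = all(
    at3(x, y, z) == at2(x * M + y, z) == at1((x * M + y) * N + z)
    for x, y, z in product(range(L), range(M), range(N))
)
```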

shkaboinka · « **Reply #92 on:** April 04, 2012, 12:25:24 pm »

I have updated the Overview to reflect my changes for arrays, tuples, functions, and default arguments.

As for interpreted stuff ($), I think that I want it to always be "deep" (recursive). That is, "$foo(...)" will cause ALL of foo to be interpreted, and "$while(...) { ... }" would cause everything in the loop to be interpreted (including inner constructs). This means that all variables declared within must be determinable, but external variables referenced may be affected at runtime.

The reasons for choosing a "deep" evaluation (versus just that "layer") are that (1) it's easier to mark a section, and (2) this is the more likely use anyway. I'd rather let single layers be optimized by the compiler rather than see people put $'s all over the place where they think that an optimization can occur ... but I do like to allow "Hey, interpret that whole thing ... I just didn't want to compute the values and embed them all by hand".

One other thing: I think I want to replace "const" values with a "final" modifier, on the grounds that a "final" value is actually embedded in the program as an (immutable) variable (e.g. "BIG_UGLY_STRING" is embedded once as a final value, rather than once for EACH place it is used), as in Java. As for value-holders, you can use $ to indicate this anyway. How does that sound?

shkaboinka · « **Reply #93 on:** April 07, 2012, 09:55:10 pm »

THE TIME HAS COME! (the walrus said) ... I have reviewed all comments, topics, posts, etc. all the way back until before I switched the language to a "Go-ish" layout, and I finally feel that OPIA is fully (syntactically and semantically) defined and decided (or at least, as much as it can be before it is coded) -- which I wanted to do before I did anything huge with the compiler.

My plans (when time permits between school, work, moving, and my soon-to-be first child) are as follows:

(1) Lay out a careful plan for the compiler pipeline/design. This has already been 90% grasped in my head alone (after having pored over compiler books and articles and mental experiments "for fun"), but I have recently re-read most of the chapters in Modern Compiler Implementation In Java (it's the BEST!), and have been charting out a careful comparison of my design versus what everyone else swears by -- and it turns out that I am not deviating terribly much. Expect an analysis and layout of the plan from me in the future (I have it mostly set).

(2) Update the Language Overview as much as I see fit (I don't want to be too picky with it, but I do want to make sure that every aspect is well documented so nothing falls through the cracks).

(3) Jump into the coding. I will keep everyone updated as that progresses. The exciting thing is that I will release it in modules, so that you can see everything up to the tokenizing & preprocessing (this is its current state, though I may rework that a bit to be SLIGHTLY more modular), parsing/tree-building, etc. Every aspect of the pipeline will be explained, as in (1).

Anyone is welcome to ask questions about anything they think is lacking or confusing, and I am still open to considering some changes/additions; but I don't foresee anything that would change the language enough to put off coding it now.

Side note: as for interpreted aspects, the default will be to precompute as much as possible without unraveling loops or recursive calls (unless the contents clearly have no runtime side-effects), and the $ operator will be "deep"/recursive, causing a thing (and ALL of its contents, except for references to externally defined entities) to be fully precomputed.

DJ Omnimaga · « **Reply #94 on:** April 07, 2012, 10:44:36 pm »

Question, will the language allow people to be a bit loose on the syntax or will it be extremely picky, like when you forget a ; in some languages or more like when you forget to close a parenthesis in TI-83+ BASIC?

shkaboinka · « **Reply #95 on:** April 08, 2012, 01:39:38 am »

Well, you can put as many spaces and tabs wherever you want

Sorry, TI-BASIC is one of very few languages I know of which is that "loose"; though I might consider whether or not I really need to require semicolons -- there were places where they were necessary, but I'll look at it again. I am going to have it give very nice error messages (specific, with locations, and reporting lots of errors rather than just quitting).

DJ Omnimaga · « **Reply #96 on:** April 08, 2012, 07:15:45 pm »

Ok. I was just wondering, because it's not a good idea if a language is too loose, since it makes it harder to find errors, as the coder gets weird errors for absolutely no reason. However there are some languages, not necessarily programming, that were just way too picky, such as PHP, where one single unnoticeable typo would always cause a syntax error or something.

shkaboinka · « **Reply #97 on:** April 12, 2012, 05:08:26 pm »

Some final notes before I begin (in May or June?). THESE ARE DECIDED (tentatively):

The new numeric datatypes are byte, sbyte, int, and uint.

String literals can take on different forms:
::: "null terminated string"
::: b"byte-prefixed string"
::: i"int-prefixed string"
::: r"raw string"

Numeric literals are of type "int" by default. Type-casts can be used to be explicit: (byte)5;
Hexadecimal and Binary literals are prefixed with 0x and 0b, respectively, and default to type "byte" or "uint" according to the number of digits used (so as to reflect a literal bit representation).

Dereferencing is now on the right-side:
::: *[]byte pa; (*pa)[idx];
::: []*byte ap; *(ap[idx]); // or just *ap[idx]

Constant (immutable) variables are defined with "const" in addition to the datatype.
Constant expressions (pasted in like macros) are defined with "const" alone.
It is illegal to mix "const" or "volatile" with the ":=" operator.

There are standard-, BASIC-, and iterator-style for-loops:
::: for(init; condition; update)
::: for(var: start, end, inc) (inc is optional)
::: for(var: array); for(var: someYieldyCofunc)

The inferred type of an array literal works as follows:
::: []T{...} = A static array of the given values
::: &[]T{...} = Pointer to the given static array
::: new [n]T = Pointer to a new (uninitialized) array allocation
::: new []T{...} = Pointer to a new array allocation (values copied from static array)

Arrays with the (first) dimension omitted ([]T or [,n]T) are pointers.
Arrays of the form [ , ,...] are rectangular (stored in one static allocation).
Arrays of the form [][][]... are jagged (the "inner" arrays are stored as pointers).
Arr[3..6] is a shorthand for the tuple expression (Arr[3], Arr[4], Arr[5]).
Tuples will remain "auto-unpacked", and no tuple-variables allowed.
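
As a loose analogy for the Arr[3..6] shorthand (Python, not OPIA): the range excludes its end, and the result is unpacked straight into separate variables rather than stored as a tuple value. Python's slice happens to use the same half-open convention; the list contents here are made up.

```python
arr = [10, 11, 12, 13, 14, 15, 16, 17]

# Arr[3..6] is described as (Arr[3], Arr[4], Arr[5]): index 6 is excluded.
# The three values are unpacked immediately, like an auto-unpacked tuple.
a, b, c = arr[3:6]
```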

Method receivers ("this") are ALWAYS (intrinsic) pointers.

Methods may be defined within structs, just as in Java/C# (compact/familiar).

There will be an "x@value" syntax for embedding variables within array literals.

Addressing goes on the LHS and applies to the whole RHS (e.g. &a[n] is &(a[n])).

Entities may not be defined within each other (e.g. no structs within structs, etc.).

Expressions cannot contain statements (declarations, assignments, calls, var++/var--).

Anonymous functions may not refer to (non- static/const) external local variables.

All code is precomputed as much as possible (without unrolling loops or recursive calls).

The $ operator requires something to be interpreted, including loops and recursive calls.

Bridge methods will be inserted for multiple "inheritance" of anonymous fields, as needed.

Control-flow constructs will indeed require parentheses (avoids parsing conflicts with literals).

No "static" members of anything (but static local vars and static initialization blocks are allowed).

Entities have global accessibility if they are capitalized, and namespace accessibility otherwise.

Look-Up-Tables (rather than Jump Tables) will be used with switches and if-else chains as possible.

Methods can only be defined for "identifier" types (structs, primitives, etc., but not funcs, arrays, etc.).

Namespaces may be nested ("Outer.Inner" syntax), and there will be a "using Namespace" mechanism.

Self-Modifying code will be used with cofuncs and switch-variables (Will consider an option to disable it).

Explicit variable addresses can be nominal (@"address") or refer to another variable (@x or @arr[n].foo).

No exception-handling or "try-catch" mechanism (use multiple return values or create an "Exit()" instead).

Type-casts will be represented traditionally (e.g. "(byte)(a+b)").

"Extra" parentheses are not allowed within datatypes ("func(...)" requires them, but []*T and *[]T are unambiguous).

Function pointers without any return values may point to functions with return values (e.g. func(byte) pointing to func(byte):byte).

Values will be passed/returned in registers such that any two functions with the same pattern of arguments will use the same registers for them.

Default arguments (and struct members) must come last, and will be embedded in functions so they can be pointed-to as their reduced versions.

An anonymous (nameless) struct/interface/cofunc/func within a namespace will take on the name of the namespace (e.g. "List myList" rather than "ListNameSpace.ListStruct myList"). This also gives namespace values ("List.staticValue") the feel of Java/C#'s static class members.
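
One item in the list above, using look-up tables for switches and if-else chains, can be pictured with a small Python sketch. This is only one reading of that item, not the compiler's actual strategy: when every branch just yields a constant for a small integer key, the whole chain can be replaced by indexing a table of values (a jump table would store code addresses instead). The function names and values are invented for the example.

```python
def sprite_width_chain(kind):
    """An if-else chain that only maps a small integer to a constant."""
    if kind == 0:
        return 8
    elif kind == 1:
        return 16
    elif kind == 2:
        return 24
    else:
        return 0


# The same mapping as a table of values indexed by the key.
WIDTHS = (8, 16, 24)


def sprite_width_table(kind):
    """Table version: one bounds check, then a direct index."""
    return WIDTHS[kind] if 0 <= kind < len(WIDTHS) else 0
```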

aeTIos · « **Reply #98 on:** April 12, 2012, 09:01:42 pm »

woah at first i was like tldr but the content is pretty impressive

Really waiting for this, pity that you cannot work on it till May.

shkaboinka · « **Reply #99 on:** April 16, 2012, 03:40:19 pm »

I just want to note that I've been updating/revising that list in my previous post (as well as cleaning it up and making it MORE READABLE) ... Any opinions?

shkaboinka · « **Reply #100 on:** April 21, 2012, 03:56:59 pm »

I am considering the possibility of having OPIA build for various platforms (z80, 68k, whatever the NSpire and CSX are), so I need to design with that option in mind (your input would greatly help; keep reading). This most directly affects datatypes:

One option would be to use the standard byte, short, int, long types (8, 16, 32, 64 bits). The up side is that type-sizes would be consistent across platforms. The down side is that it would feel goofy using "shorts" (and "bytes") for z80, and using other types from other platforms.

The option I'm considering is to just use "byte" and "int", with "int" being whatever the "word size" is. Type sizes would not be consistent, but each processor is probably best suited to work with its word-sized ints anyway.

I don't know if I will actually make it for more than just the z80, but I do want to design so that this could happen easily and without affecting how the language is defined. IT WOULD HELP TREMENDOUSLY IF ANYONE COULD TELL ME WHAT SIZES OF VALUES THE DIFFERENT PROCESSORS WORK WITH, AND THE LIMITATIONS OF EACH (68k, nspire, casio) (e.g. the z80 works best with 8-bit values, but has a 16-bit word size and can work with them as well; though some operations require multiple instructions). ... I just want to be able to make informed decisions

aeTIos · « **Reply #101 on:** April 25, 2012, 06:47:12 am »

You know the stuff for the z80, no?

jsj795 · « **Reply #102 on:** April 25, 2012, 07:25:58 am »

sadly I'm pretty clueless when it comes to this stuff

But I do plan on learning C++ and when I do, I can probably code in OPIA since it looks really similar!

Anyways, great job so far

shkaboinka · « **Reply #103 on:** May 09, 2012, 08:37:32 pm »

Ok, how is this setup for datatypes and literals?

byte, char, bool: unsigned 8-bit values
sbyte: signed 8-bit values
int: signed 16-bit values
uint: unsigned 16-bit values

Numeric Literals would be resolved (by default) as follows:
55: int
(55i: int)
55u: uint
55b: byte
55s: sbyte

Hexadecimal and Binary literals would depend on the number of digits for how it resolves (by default), but explicit indicators can be used as well:
0x1: byte (1 or 2 digits)
0x001: uint (3 or 4 digits)
0x001b: byte (b indicator)
0x1i: int (i indicator)

0b1: byte (8 or fewer digits)
0b000000001: uint (more than 8 digits)
0b1u: uint (u indicator)

...Note that those are DEFAULT evaluations (e.g. "X := 5" makes X an int); but a clear context may result in a different type (e.g. perhaps the compiler can tell that looping from 1 to 10 only needs a byte).
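
The default rules above for decimal and binary literals are simple enough to write down as a small classifier. This Python sketch is only an illustration of the rules as listed, with a made-up function name `literal_type`. Hex literals are left out on purpose: the `b` indicator is also a valid hex digit, and the list does not say how the two are told apart.

```python
import re

# Suffix letters and the types they select, as listed in the post.
SUFFIX_TYPES = {"i": "int", "u": "uint", "b": "byte", "s": "sbyte"}


def literal_type(text):
    """Return the default type name for a decimal or binary literal."""
    binary = re.fullmatch(r"0b([01]+)([iubs]?)", text)
    if binary:
        digits, suffix = binary.groups()
        if suffix:
            return SUFFIX_TYPES[suffix]  # explicit indicator wins
        # Up to 8 binary digits fit in a byte; anything longer is a uint.
        return "byte" if len(digits) <= 8 else "uint"

    decimal = re.fullmatch(r"([0-9]+)([iubs]?)", text)
    if decimal:
        suffix = decimal.group(2)
        # Undecorated decimal literals default to int.
        return SUFFIX_TYPES[suffix] if suffix else "int"

    raise ValueError(f"not a decimal or binary literal: {text!r}")
```

This only covers the DEFAULT resolution; it does not model the context-based narrowing mentioned in the note above.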

For string literals:
"null terminated string"
r"raw string literal"
b"byte-prefixed string"
u"uint-prefixed string"
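
For anyone trying to picture the four string forms, here is a Python sketch of one possible byte layout. Everything about the layouts is an ASSUMPTION for illustration: that "prefixed" means the length is stored in front of the characters, that the uint prefix is 16-bit little-endian (the z80's byte order), and that a raw string is just the characters with no terminator or prefix. The function name `encode_string` is made up.

```python
import struct


def encode_string(text, form=""):
    """Encode text under one of the four literal forms (layouts are assumed)."""
    data = text.encode("ascii")
    if form == "":   # "..." : characters followed by a 0 terminator
        return data + b"\x00"
    if form == "r":  # r"..." : assumed to be the characters alone
        return data
    if form == "b":  # b"..." : assumed 8-bit length prefix
        if len(data) > 0xFF:
            raise ValueError("too long for a byte length prefix")
        return struct.pack("<B", len(data)) + data
    if form == "u":  # u"..." : assumed 16-bit little-endian length prefix
        if len(data) > 0xFFFF:
            raise ValueError("too long for a uint length prefix")
        return struct.pack("<H", len(data)) + data
    raise ValueError(f"unknown string form: {form!r}")
```

Under these assumptions, "Hi" takes 3 bytes null-terminated, 2 bytes raw, 3 bytes byte-prefixed, and 4 bytes uint-prefixed.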

... How does that all sound?

BlakPilar · « **Reply #104 on:** May 09, 2012, 08:54:49 pm »

By byte- and uint-prefixed strings, do you mean that number would be the length of the string?
