Author Topic: tok8x: a very simple on-computer tokeniser/detokeniser  (Read 8616 times)

Offline shmibs

  • しらす丼
  • Administrator
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2132
  • Rating: +281/-3
  • try to be ok, ok?
    • View Profile
    • shmibbles.me
tok8x: a very simple on-computer tokeniser/detokeniser
« on: April 06, 2013, 10:48:08 am »
things like kerm's Source Coder 2 or merth's TokenIDE are all well and good for converting an 8x program between plaintext and link formats, but both share two limitations: they require a graphical environment and cannot be scripted. to address both, i've started writing a tokeniser/detokeniser of my own (github). it's already functional; the remaining tasks are adding in full token sets (right now it only contains a very small subset of the axe tokens, which i was using for testing purposes), implementing the option to skip comment lines, optionally writing to stdout, and writing the detokeniser (which is much simpler than its counterpart). have a screenshot!
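
For readers wondering what the tokenising step looks like, here is a minimal sketch in C. It assumes a greedy longest-match scan over a lookup table, which is a common way to do this and not necessarily how tok8x does it. The `struct tok` layout, the `tokenise` name and the tiny table are hypothetical, and the byte values are the ones commonly listed for the 83+/84+ token set, so check them against a real token table before relying on them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: plaintext spelling and a one-byte token value. */
struct tok {
    const char *text;
    unsigned char byte;
};

/* Tiny illustrative subset; a real table covers the whole token set. */
static const struct tok toks[] = {
    { "DispGraph", 0xDF },
    { "Disp ",     0xDE },
    { "->",        0x04 },
    { "A",         0x41 },
    { "D",         0x44 },
    { "1",         0x31 },
    { "\n",        0x3F },
};

/*
 * Convert plaintext to token bytes, always taking the longest table entry
 * that matches at the current position. Returns the number of bytes written,
 * or (size_t)-1 on unknown text or when out is too small.
 */
size_t tokenise(const char *src, unsigned char *out, size_t out_cap)
{
    size_t n = 0;
    while (*src != '\0') {
        const struct tok *best = NULL;
        size_t best_len = 0;
        for (size_t i = 0; i < sizeof toks / sizeof toks[0]; i++) {
            size_t len = strlen(toks[i].text);
            if (len > best_len && strncmp(src, toks[i].text, len) == 0) {
                best = &toks[i];
                best_len = len;
            }
        }
        if (best == NULL || n == out_cap)
            return (size_t)-1;
        out[n++] = best->byte;
        src += best_len;
    }
    return n;
}
```

Taking the longest match is what keeps `DispGraph` from being read as the letter `D` followed by unknown text. Two-byte tokens would need a wider entry than this sketch has.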
« Last Edit: April 06, 2013, 10:56:23 am by shmibs »

Offline Lionel Debroux

  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2135
  • Rating: +290/-45
    • View Profile
    • TI-Chess Team
Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #1 on: April 06, 2013, 11:28:55 am »
I've just relayed the information about your project over at TI-Planet: http://tiplanet.org/forum/viewtopic.php?f=10&t=11529

Nitpick about the code shown in the screenshot: you should declare your token lists "const" (and probably even the character string in the token struct), so that they're placed in .rodata, which can help catch bugs in some circumstances :)
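
A minimal sketch of that suggestion follows. The screenshot is not reproduced here, so the `struct token` layout, the `token_list` and `find_token` names and the three entries are hypothetical, and the byte values are the commonly listed 83+/84+ ones rather than anything taken from tok8x.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout; the struct in tok8x's source may differ. */
struct token {
    const char *name;      /* pointer to const: the spelling cannot be modified */
    unsigned char byte;    /* token value */
};

/* const on the array itself lets the toolchain place the table in .rodata. */
static const struct token token_list[] = {
    { "Disp ",   0xDE },
    { "ClrHome", 0xE1 },
    { "End",     0xD4 },
};

/* Lookups hand out pointers to const, so callers cannot alter the table. */
const struct token *find_token(const char *name)
{
    for (size_t i = 0; i < sizeof token_list / sizeof token_list[0]; i++) {
        if (strcmp(token_list[i].name, name) == 0)
            return &token_list[i];
    }
    return NULL;
}
```

With these declarations, an accidental write such as `token_list[0].byte = 0;` is rejected at compile time. A write that goes through a cast pointer will typically fault at run time on platforms that map .rodata read-only, instead of silently corrupting the table.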
« Last Edit: April 06, 2013, 11:29:10 am by Lionel Debroux »
Member of the TI-Chess Team.
Co-maintainer of GCC4TI (GCC4TI online documentation), TILP and TIEmu.
Co-admin of TI-Planet.

Offline mdr1

  • LV6 Super Member (Next: 500)
  • ******
  • Posts: 303
  • Rating: +21/-2
    • View Profile
Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #2 on: April 06, 2013, 11:31:47 am »
Great! Will you add some directives like #include, #define, etc.?



Offline Sorunome

  • Fox Fox Fox Fox Fox Fox Fox!
  • Support Staff
  • LV13 Extreme Addict (Next: 9001)
  • *************
  • Posts: 7920
  • Rating: +374/-13
  • Derpy Hooves
    • View Profile
    • My website! (You might lose the game)
Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #3 on: April 06, 2013, 11:38:30 am »
Nice project!
And yeah, a preprocessor would be fun :D

THE GAME
Also, check out my website
If OmnomIRC is screwed up, blame me!
Click here to give me an internet!

Offline shmibs

Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #4 on: April 06, 2013, 02:01:02 pm »
what sorts of things would you want from a preprocessor (besides define, which might be doable)?

and thanks, lionel =)

Offline Lionel Debroux

Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #5 on: April 06, 2013, 02:47:54 pm »
You're welcome :)

Together, #defines (even fairly simple ones) and #includes can have the same positive effect on maintainability as they have in C/C++ or, say, LaTeX: they allow splitting a program / document across multiple files, and having parametrized values that are repeated (and simplified) multiple times in a program defined once, so that they can be changed in a single place.
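
To make the #define half concrete, here is a minimal sketch of plain textual substitution on one source line, done before tokenising. It is an illustration under stated assumptions and not a tok8x feature: `struct define` and `expand_defines` are hypothetical names, matching is by raw substring with no word-boundary checks, and expanded values are not rescanned.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macro entry: every occurrence of name is replaced by value. */
struct define {
    const char *name;
    const char *value;
};

/*
 * Copy line into out, replacing each occurrence of a defined name with its
 * value. Returns 0 on success, -1 if out (capacity cap) is too small.
 */
int expand_defines(const char *line, const struct define *defs, size_t ndefs,
                   char *out, size_t cap)
{
    size_t n = 0;
    while (*line != '\0') {
        const char *piece = line;   /* default: copy one character through */
        size_t piece_len = 1;
        size_t skip = 1;
        for (size_t i = 0; i < ndefs; i++) {
            size_t name_len = strlen(defs[i].name);
            if (name_len > 0 && strncmp(line, defs[i].name, name_len) == 0) {
                piece = defs[i].value;
                piece_len = strlen(defs[i].value);
                skip = name_len;
                break;
            }
        }
        if (n + piece_len >= cap)   /* keep one byte for the terminator */
            return -1;
        memcpy(out + n, piece, piece_len);
        n += piece_len;
        line += skip;
    }
    if (cap == 0)
        return -1;
    out[n] = '\0';
    return 0;
}
```

With a single entry mapping `SPEED` to `3`, the line `X+SPEED->X` expands to `X+3->X`, so changing the value in one place updates every use.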

Offline Sorunome

Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #6 on: April 06, 2013, 02:49:57 pm »
#if calculator=84+ then (89? nspire?)
<code>
#else
<code>
#end

just a thought :)


Offline shmibs

Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #7 on: April 06, 2013, 03:21:33 pm »
Quote
You're welcome :)

Together, #defines (even fairly simple ones) and #includes can have the same positive effect on maintainability as they have in C/C++ or, say, LaTeX: they allow splitting a program / document across multiple files, and having parametrized values that are repeated (and simplified) multiple times in a program defined once, so that they can be changed in a single place.

hokay, those both seem like they'll be easy enough to manage. i'll add them on to the end of the todo list.

Quote
#if calculator=84+ then (89? nspire?)
<code>
#else
<code>
#end

just a thought :)

this is only for the 8x series, though. EDIT: hmm, #if (compile option = blah ) would be really useful for debugging purposes, though...

speaking of which, what changes have there been to the token set for the 84+CSE? are there any new 2-byte regions?

EDIT2: xeda and runer? i don't really trust myself to make sure all the grammer and axe tokens are defined correctly, so, once i write those, do you think you could take a look at them and make sure i'm not missing anything important?
« Last Edit: April 06, 2013, 03:36:26 pm by shmibs »

Offline Lionel Debroux

Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #8 on: April 06, 2013, 03:37:03 pm »
Quote
this is only for the 8x series
If your tokenizer is modular enough, it could (assuming someone spends time on it, while there are few programmers for that platform anymore...) be used for TI-68k/AMS programs as well. The token lists for AMS are well known, and haven't evolved since 2005 (89T / V200 AMS 3.10).
As far as the Nspire is concerned... I performed enough reverse-engineering on the OS in 2011 to show that Nspire BASIC programs are still based on tokens similar to the AMS BASIC ones (and in general, the Nspire's CAS still has its foundation in the 92 CAS from 1996), but there are lots of undocumented things on the Nspire, and third parties are not making Nspire BASIC programs on the computer side.

There are, indeed, some new tokens for the 84+CSE, but I'm not the most knowledgeable about them - ask Kerm, BrandonW, Benjamin Moody or several others :)

Offline Adriweb

  • Editor
  • LV10 31337 u53r (Next: 2000)
  • **********
  • Posts: 1708
  • Rating: +229/-17
    • View Profile
    • TI-Planet.org
Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #9 on: April 06, 2013, 07:12:35 pm »
Quote
third parties are not making Nspire BASIC programs on the computer side.
Hmm, you mean other than using TINCS?

Because using the software is by far the most efficient way to create BASIC programs.
« Last Edit: April 06, 2013, 07:12:51 pm by adriweb »
My calculator programs
TI-Planet.org co-admin.
TI-Nspire Lua programming : Tutorials  |  API Documentation

Offline DJ Omnimaga

  • Clacualters are teh gr33t
  • CoT Emeritus
  • LV15 Omnimagician (Next: --)
  • *
  • Posts: 55941
  • Rating: +3154/-232
  • CodeWalrus founder & retired Omnimaga founder
    • View Profile
    • Dream of Omnimaga Music
Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #10 on: April 07, 2013, 01:04:18 am »
Will this support XML files like Tokens (and even the same format)?

Offline shmibs

Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #11 on: April 07, 2013, 01:12:44 am »
there isn't really any need for xml files. anyone can just poke me with a new set to add to the source (or do it themselves) and compile it in. it's not that much of an overhead, size-wise. i just finished adding in all the Axe tokens, for example, and it only increased the executable's size from 14.8kb to 19.9kb
« Last Edit: April 07, 2013, 01:15:07 am by shmibs »

Offline DJ Omnimaga

Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #12 on: April 08, 2013, 01:25:13 am »
Ah ok I was mostly wondering since new commands could be added in apps like Grammer/Axe/etc, and cross-compatibility with TokenIDE XML files would prevent your app from falling behind if you ever got too busy to maintain it.

Offline shmibs

Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #13 on: April 08, 2013, 04:40:38 am »
i just changed the storage format for other libraries so that they only need to list tokens that are changed from the main token set (like >Char instead of >Frac), so that should make maintenance a much simpler matter.
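
A minimal sketch of how such an override scheme can work, assuming a library set is just a short list consulted before the main one. The `struct token` layout and the `base_set`, `axe_set` and `token_name` names are hypothetical and not tok8x's actual storage format. The `0x03` value for >Frac is the commonly listed one and should be checked against a real token table.

```c
#include <stddef.h>

/* Hypothetical entry: a one-byte token value and its plaintext spelling. */
struct token {
    unsigned char byte;
    const char *name;
};

/* Main token set (tiny illustrative subset). */
static const struct token base_set[] = {
    { 0x03, ">Frac" },
    { 0xDE, "Disp " },
};

/* A library set lists only the tokens it renames. */
static const struct token axe_set[] = {
    { 0x03, ">Char" },
};

/* Linear search of one set; returns NULL when the byte is not listed. */
static const char *lookup(const struct token *set, size_t n, unsigned char byte)
{
    for (size_t i = 0; i < n; i++) {
        if (set[i].byte == byte)
            return set[i].name;
    }
    return NULL;
}

/* Detokenise one byte: try the library's overrides first, then the main set. */
const char *token_name(const struct token *overrides, size_t n_overrides,
                       unsigned char byte)
{
    const char *name = lookup(overrides, n_overrides, byte);
    if (name == NULL)
        name = lookup(base_set, sizeof base_set / sizeof base_set[0], byte);
    return name;
}
```

Calling `token_name(axe_set, 1, 0x03)` yields `>Char`, while any byte the library does not list falls through to the main set, so a new library only has to carry its differences.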

also, here's a little thing i thought people might find useful:

Offline merthsoft

  • LV5 Advanced (Next: 300)
  • *****
  • Posts: 241
  • Rating: +63/-1
    • View Profile
Re: tok8x: a very simple on-computer tokeniser/detokeniser
« Reply #14 on: April 08, 2013, 04:41:06 pm »
This is pretty neat. It's similar to what elfprince is doing, it seems. I've thought about just having a simple executable that doesn't require the editor and everything (it would be fairly simple), but no one expressed interest in it so I haven't done it--yours is probably better for that anyway since it's in C and therefore requires fewer dependencies.

One suggestion I would have is to make it so it can take TokenIDE-style XML files and use those for tokenization/detokenization. I see that you've mentioned that, but your solution of "anyone can just poke me with a new set to add to the source (or do it themselves) and compile it in" isn't really ideal. Why make users recompile when they can just drop in an XML file? It also has the added bonus of making it so if someone makes a new token set for TokenIDE, it'll automatically work with yours and vice-versa. Standardization is, I think, a good thing.

Quote
There are, indeed, some new tokens for the 84+CSE, but I'm not the most knowledgeable about them - ask Kerm, BrandonW, Benjamin Moody or several others :)
Or Merth ;). The latest release of TokenIDE has the 84+CSE XML file with all the new/renamed tokens:
http://merthsoft.com/Tokens.zip
(Hopefully you don't think I'm trying to advertise, that's just where the xml file is, and you can use that for the new tokens.)
« Last Edit: April 08, 2013, 04:44:03 pm by merthsoft »
Shaun