Creating a new scripting language in PHP

Jorge Castro
4 min read · Dec 25, 2018

Why do we need it?

Consider the following problem. We want to perform this operation:

$a2=20;
$result=$a2*20 ; // this operation could be dynamic, it could change.
//… later
$a2=5000;
// and here we want to calculate it.
echo $result;

So we are just calculating 20x20. The later $a2=5000 is ignored, and the result is, as expected:

$result=400;

This is because the operation is evaluated immediately, on the line where it is written. But what if you want to run the operation later, or dynamically? Then let's write it as follows:

$a2=20;
$result='$a2*20'; // our script.
$a2=5000;
calculate($result); // and we calculate result with $a2=5000
$a2=6000;
calculate($result); // and we calculate result with $a2=6000

So what PHP sees is the string '$a2*20' (a script written in our new language), and we can run the operation later, whenever we want.
We could use functional programming to do the same thing, but it is neither trivial nor elegant. Still, it is an alternative.
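As a rough illustration of that functional alternative, here is a minimal sketch that wraps the operation in a closure. Capturing $a2 by reference is my own choice for the example; the article does not show how the functional version would look.

```php
<?php
// Alternative without a new language: wrap the operation in a closure.
// $a2 is captured by reference, so its current value is read at call time.
$a2 = 20;
$result = function () use (&$a2) {
    return $a2 * 20;
};

$a2 = 5000;
$first = $result();   // calculated with $a2=5000
$a2 = 6000;
$second = $result();  // calculated with $a2=6000

echo $first, "\n", $second, "\n";
```

The drawback is that the operation is now fixed in PHP code. A script kept as a string can be stored in a configuration file or a database and changed without touching the code.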
But there is more: since we are creating a new language, we can add some syntactic sugar and restrictions.
What I learned from Laravel's Blade is that we don't need to change things much. What if $var is still a variable (à la PHP), 20 is still a number, and "text" and 'text' are still strings? So it's decided:

  • $var = it's a global PHP variable.
  • 20 = it's a number.
  • "text" = it's a string.
  • 'text' = it's a string too.

However, this is not a full-fledged language; it is only a basic parser. A business decision: we don't want to fill the language with cryptic codes and definitions, and developers shouldn't need to spend time learning a new language. So I decided:

  • field = it’s a field.

What is a field? A field is an inner variable (stored inside the language). So, while $field is a global PHP variable, field is a variable inside the new language. Specifically, field is equal to $array['field'].

  • function() = it's a function

A function is named like a field, but it is followed by a "(" symbol. So "something" is a field, while "something()" is a function.
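These rules map quite directly onto PHP's own tokenizer, which the article says the language uses (token_get_all()). The following is only a sketch of how the operands could be classified; the helper name classifyOperands() and its output format are hypothetical and are not taken from the library.

```php
<?php
// Hypothetical helper: classify the operands of a script with token_get_all().
function classifyOperands(string $script): array
{
    // token_get_all() only tokenizes PHP code after an opening tag.
    $tokens = token_get_all('<?php ' . $script);
    $found = [];
    foreach ($tokens as $i => $token) {
        if (!is_array($token)) {
            continue; // single characters such as "=", "(" or ","
        }
        [$id, $text] = $token;
        switch ($id) {
            case T_VARIABLE:
                $found[] = ['variable', $text];
                break;
            case T_LNUMBER:
            case T_DNUMBER:
                $found[] = ['number', $text];
                break;
            case T_CONSTANT_ENCAPSED_STRING:
                $found[] = ['string', $text];
                break;
            case T_STRING:
                // A name directly followed by "(" is a function, otherwise a field.
                $next = $tokens[$i + 1] ?? null;
                $found[] = [$next === '(' ? 'function' : 'field', $text];
                break;
        }
    }
    return $found;
}

foreach (classifyOperands('$var=20 and field="text" or something()') as [$kind, $text]) {
    echo $kind, ': ', $text, "\n";
}
```

The sketch treats a name as a function only when "(" follows it immediately, and it ignores operators and keywords such as and and or. A real parser would also have to handle names that collide with PHP keywords.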

Parts of the language.

The syntax is split into two parts: where and set. The language also looks like SQL, but it doesn't need (yet) select, order, from or any other part of SQL, only where and set.

Where is conditional

where $field=20 and field2<>40 or field3=40 // sql syntax

But it also allows (why not?)

where $field==20 && field2!=40 || field3==40 // PHP syntax

Where is a condition and returns a boolean: true if $field equals 20 and field2 is not equal to 40, or field3 equals 40. The SQL syntax is cleaner, but the PHP syntax is more natural for PHP developers, so the learning curve is short either way. Pick one.
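To make the idea concrete, here is a minimal sketch of an evaluator for a small subset of where. It is not the library's code: the function names are hypothetical, it only handles fields and numbers (no $variables, strings or functions), and it assumes that and binds tighter than or, which the article does not state.

```php
<?php
// Hypothetical evaluator for a tiny subset of the "where" part.
function operandValue(string $operand, array $fields)
{
    $operand = trim($operand);
    if (is_numeric($operand)) {
        return $operand + 0;              // numeric literal
    }
    return $fields[$operand] ?? null;     // field lookup
}

function compareOne(string $comparison, array $fields): bool
{
    // Two-character operators come first so "<>" is not read as "<".
    if (!preg_match('/^(.+?)(==|!=|<>|=|<|>)(.+)$/', trim($comparison), $m)) {
        throw new InvalidArgumentException("Cannot parse: $comparison");
    }
    $left = operandValue($m[1], $fields);
    $right = operandValue($m[3], $fields);
    switch ($m[2]) {
        case '=':
        case '==':
            return $left == $right;
        case '<>':
        case '!=':
            return $left != $right;
        case '<':
            return $left < $right;
        default:
            return $left > $right;
    }
}

function evaluateWhere(string $where, array $fields): bool
{
    // Assumption: "or" has the lowest precedence, so split on it first.
    foreach (preg_split('/\s+or\s+|\|\|/i', $where) as $orPart) {
        $allTrue = true;
        foreach (preg_split('/\s+and\s+|&&/i', $orPart) as $comparison) {
            $allTrue = $allTrue && compareOne($comparison, $fields);
        }
        if ($allTrue) {
            return true;
        }
    }
    return false;
}

$fields = ['field' => 20, 'field2' => 10, 'field3' => 0];
var_dump(evaluateWhere('field=20 and field2<>40 or field3=40', $fields));
var_dump(evaluateWhere('field==20 && field2!=40 || field3==40', $fields));
```

Both spellings go through the same code path, which is why supporting the SQL and the PHP style at once costs very little.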

Set is for setting

set field=20,field2=50,field3+30

Here we store the value 20 in field and the value 50 in field2, and we increase field3 by 30. The comma is the separator. The language doesn't allow field3+=30, but it allows field3+30.
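A minimal sketch of how such a set expression could be applied to an array of fields follows. The function applySet() is hypothetical, and it only accepts integer values; it is meant to show the "=" versus "+" behaviour, not the library's implementation.

```php
<?php
// Hypothetical sketch of the "set" part: apply comma-separated
// assignments to an array of fields and return the updated array.
function applySet(string $set, array $fields): array
{
    foreach (explode(',', $set) as $assignment) {
        // A name, then "=" (assign) or "+" (increase), then an integer.
        if (!preg_match('/^\s*(\w+)\s*([=+])\s*(-?\d+)\s*$/', $assignment, $m)) {
            throw new InvalidArgumentException("Cannot parse: $assignment");
        }
        [, $name, $operator, $value] = $m;
        if ($operator === '=') {
            $fields[$name] = (int) $value;
        } else {
            // A missing field is treated as 0 before it is increased.
            $fields[$name] = ($fields[$name] ?? 0) + (int) $value;
        }
    }
    return $fields;
}

$fields = applySet('field=20,field2=50,field3+30', ['field3' => 12]);
print_r($fields);
```

Splitting on every comma is a simplification: it would break a function call with several arguments, so a real parser has to work on tokens instead.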

Some syntactic sugar

The language also allows some extra magic.
Let's say I want the month of a variable. We could write it as

month(variable) // php style of methods.

but the language also allows:

variable.month // Java and C# style of methods.

The language resolves the following expression with these checks:

somefield.param

  • If the caller (of the library) has a function called "param", it calls $caller->param(somefield).
  • If the caller has a property called param, it reads $caller->param.
  • If the caller is an array, it reads $caller[somefield].
  • Otherwise, it calls the method of the service class: $serviceClass->param(somefield).
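The lookup order above could be sketched as follows. Everything here is illustrative: resolveDotted(), DateService and Order are hypothetical names, and the array branch follows the list literally by using the field name as the key, which is my reading of it.

```php
<?php
// Hypothetical sketch of the lookup order for "somefield.param".
// $caller is whoever uses the language; $service is the fallback service class.
function resolveDotted($caller, object $service, string $field, $value, string $param)
{
    if (is_object($caller) && method_exists($caller, $param)) {
        return $caller->$param($value);     // $caller->param(somefield)
    }
    if (is_object($caller) && property_exists($caller, $param)) {
        return $caller->$param;             // $caller->param
    }
    if (is_array($caller)) {
        return $caller[$field] ?? null;     // $caller[somefield]
    }
    return $service->$param($value);        // $serviceClass->param(somefield)
}

class DateService                           // stand-in for the service class
{
    public function month($value)
    {
        return (int) date('n', strtotime($value));
    }
}

class Order                                 // stand-in caller without month()
{
    public $shipped = '2018-12-25';
}

// Order has no month() method or property, so the service class answers.
echo resolveDotted(new Order(), new DateService(), 'shipped', '2018-12-25', 'month'), "\n";
```

Checking the caller first lets an application override a function without touching the service class.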

But where do the functions come from?

Except for a small set of built-in functions, every function is added dynamically, either by whoever calls the language (an object) or by a service class (defined by the language).

One special built-in function is flip(), which inverts the value of a variable.

Restrictions

The language is limited to a set of commands. For performance reasons, it doesn't allow parentheses (except for function calls).

set a1=(20+30) //it’s not allowed

The language is only for logic and setting values, nothing more. It can't define functions, but it can use them. It has no if statement because the condition is already built in (the where part). Also:

where a2=20+30 // not allowed
set a2=20+30 // allowed
where function(20)=20 // allowed
set function(20)=20 // allowed too
where function(20+30)=2 // not allowed
where function(20,30)=2 // allowed

Usage?

This new language is used by the following open-source library:

It is a state machine, and it requires the language. Example of usage:

$smachine->addTransition(STATE_PICK,STATE_CANCEL,'when instock = 0',"stop");
$smachine->addTransition(STATE_PICK,STATE_TRANSPORT,'when picked = 1',"change");
$smachine->addTransition(STATE_TRANSPORT,STATE_TODELIVER,'when addressnotfound = 0',"change");
$smachine->addTransition(STATE_TRANSPORT,STATE_HELP,'when addressnotfound = 1',"change");

The language takes only 600 lines of code, and it is still scalable. It uses the PHP function token_get_all() as its parser. It is a bit hacky, but it works and it is fast enough.

Now the language is also used by this library:

It is a generator of fake data for filling a database. Unlike other alternatives, it focuses on creating data based on a context or logic.

Example:

->gen('when _index<200 then idtable.value=parabola(50,2500,-1,1,1)')
->gen('when _index<200 then idtable.value=randomprop(1,2,3,30,50,20)')
->gen('when _index<200 then idtable.add=sin(0,0,10,30)')
->gen('when _index<200 then idtable.value=sin(0,0,10,1)')
->gen('when _index<200 then idtable.value=log(0,0,100)')
->gen('when _index<200 then idtable.value=exp(0,0,10)')
->gen('when _index<200 then idtable.value=ramp(0,100,10,100

Edit

Now the language lives in its own library. The library follows the current ideology: slim (a single class), fast, and with minimal or no dependencies. In this case, it depends only on PHPUnit and nothing more.
