I finally finished the mzscheme backend. As I said before, I should have finished it before writing the last week’s status report. However, I’m still pleased to see the result though a little late than scheduled. So I’ll write a follow up of the last status report.
With the experience of implementing the guile backend. It should be not too hard to write yet another backend. It is true for most parts. However, there are many differences between different Scheme implementations. Here are some examples:
Scheme_Type is necessary in mzscheme.`(,,(+ 1 2)) will raise unbound variable error when evaluated in guile, but syntax error will be raised when compiling it in mzscheme. Different exceptions are raised from different stages. It’s really hard to uniform those behaviors.SchemeError (this is the ancestor of all Scheme exceptions) will be raised when compile and evaluate the code.mzscheme with apt-get, I didn’t find any 3m related files/libraries. I then assumed that the new model hasn’t been widely used yet (this is approved to be false, I’ll cover this later). So I decide to use the CGC, which seemed to be a little easier to use.ctypes.scheme_register_extension_global carefully, this kind of problem wouldn’t happen.If you look at those examples above carefully, you’ll notice they all have a common property: one is dealing with two similar systems with C (which is very much different) as the bridge:
However, if everything is trivial, we’ll be left with nothing to do.
And finally there’s the mzscheme backend. The benchmarks results is very good although I think I used a slow way of handling exceptions (I didn’t include the pyscheme backend, because it takes too long to run):
pluskid@benchmark $ python bmloadvm.py
Time to load the VM ------------------------------------------------------------
user system real total
guile 0.040000 0.050000 0.120000 0.210000
mzscheme 0.030000 0.050000 0.250000 0.330000
[19721 refs]
pluskid@benchmark $ python bmtailloop.py
Tail call performance ----------------------------------------------------------
user system real total
guile 0.006000 0.002000 0.007000 0.015000
mzscheme 0.002000 0.000000 0.002000 0.004000
[19732 refs]
pluskid@benchmark $ python bmtak.py
tak benchmark ------------------------------------------------------------------
user system real total
guile 11.996667 0.026667 12.110000 24.133333
mzscheme 2.933333 0.010000 2.953333 5.896667
[19733 refs]
pluskid@benchmark $ python bm_strcat.py
string-append and substring performance ----------------------------------------
user system real total
guile 0.012000 0.026000 0.038000 0.076000
mzscheme 0.076000 0.026000 0.103000 0.205000
[19732 refs]Besides, mzscheme support compiling. I’m planning to write some benchmarks to test the performance improvements brought by this feature. Generally, here’s my current tasks (I’ll try to keep several hard tasks simultaneously, if one of them stalled due to some difficulties, I can pick up another):
timeit module.Although sometimes it is frustrating to see segfaults all over the screen, the whole process has been interesting, especially when you finally find the way to solve the problem. I’ve also learned a lot from this.
One example is the altitude to the test cases. I always admire TDD/BDD, like to write tests, and especially glad to see all the test cases passing without problems. But I’m a little sensitive about the non-passing test cases. When I see some non-passing test cases, I have really strong will to make them pass. But sometimes I will change (or even remove) the test case merely to make it pass.
E.g. the test case of (= 1) is asserted to return #t. But exception is raised complaining wrong number of arguments are provided in mzscheme. There are different behaviors in different implementations. It is almost impossible to modify the native implementation to fix it (even if your patched get accepted, it would takes too long to let the new version with your patch become widely used), so I decided this is undefined behavior and removed the test case.
But I was wrong, even though it is undefined behavior, the test case should still be there. It is generally OK to have non-passing test cases. And the goal of a test case is not to pass, but to broke your code. Because 1000 (or even more) test cases passed won’t prove your program right, but 1 test case failed is enough to tell you the problem of your program.
Another important things I’ve learned from this (and from the whole open source community) is that source code is always good document. Sometimes it will be more clear to look at the source code directly than reading a bunch of long manuals. And open source makes it possible!
Here’s the snippets I found in the mzscheme source code to check whether the parameter is a valid list:
int schemelistp(SchemeObject *o)
{
SchemeObject obj1, *obj2;
obj1 = obj2 = o;
do
{
if (SCHEME_NULLP(obj1))
return 1;
if (!SCHEME_PAIRP(obj1))
return 0;
obj1 = SCHEME_CDR(obj1);
if (SCHEME_NULLP(obj1))
return 1;
if (!SCHEME_PAIRP(obj1))
return 0;
obj1 = SCHEME_CDR(obj1);
obj2 = SCHEME_CDR(obj2);
}
while (NOT_SAME_OBJ(obj1, obj2));
return 0; / circular list */
}It detects circular list cleverly while still keeps the common case (without circular list) fast.