-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Syntax improvements #57
Comments
`var appears to only allow literal/defalt assignment``
TLDR: Also not trivial to fix
`In a switch statement I cannot declare a local variable.``
If I didn't answer a part of your question please let me know and ofcourse Generally consider that zephir still is alpha, so it shouldn't |
Thanks, these answers help.
If that is not what zephir is for then its purpose is a mystery :D.
I agree. I'm using this to create an initial extension that I will then be checked and modified by hand. I'm a bit worried when I see all the zephir* functions/macros. I've tried PHP-CPP for this and phpc but with out success for various reasons. Basically, I can write PHP extensions but it is much much slower than writing in PHP especially when the PHP to be converted is particularly complex. Having something that lays down an initial codebase that can be optimised/tuned is the best option for me. However, as proof on concept first it would be nice if it could actually parse.
Anything PHP related with anything safety critical is a laugh. I'm not really thinking of the standard PHP references mechanism but how the array variables are handled internally in the traditional C style (actually leveraging C pointers transparently to the higher level code). Perhaps that is more complex because it requires a fundamental rethink of the library. At the very least if zephir could support array access in the same way as PHP that would at least keep it on par. Minor annoyances aside it looks like everything else works fine with zephir except this case which is a blocker and makes porting code no longer viable.
If you point me at where that is handled in the code maybe I can look at it. The way PHP manages arrays by default is simply flawed. They are passed by reference for read but copied on a write. This isn't about microoptimisation but about when you have a task that needs to do heavy and complex work with arrays. Those tasks are not invalid. The issue is that PHP is lopsided in it's underlying implementation. Passing arrays as COW references is a legacy hangup. COW is great when you actually want to make a copy to be updated somewhere that will be JIT copied when it reaches its destination or will cost little if it turns out down the line you didn't want to make a modified copy. I think nearly all major languages pass arrays by true reference except ones that are truly functional for concurrency (PHP is very far departed from that). This is not an issue of doing something not supposed to be done but a poor internal implementation of arrays. Even if the framework can't be switched to not use COW by default it should still behave as PHP does. The traditional work around is the odd object wrapper. This way you can modify an array as such: $ref->arr["key"]["key"] = 123; and of course I have a C extension that can update a dynamic set of keys. It's not favourable to use objects everywhere because objects add a layer of indirection and other weird qualities. Basically the same problems with Javascript where objects and dictionaries are one and the same. A bit worse though because PHP arrays are already trying to be several things at once without also making them objects. Here is an example of what I would personally do in my own extensions if I needed a variable to an array to work with...
Example of speed improvement with dereferencing:
A took 0.23867607116699s empty A is faster despite the extra cost of resolving the property 'arr'. This gets a lot hairier when your structure parsing is much more complex, memory usage and when you consider the COW imposes some limitations on your data traversal pattern/program flow. The copy also will be deep when your structure has nested items that you might modify. |
Generally also stuff like
Have you looked at the declaration of ZEPHIR_CPY_WRT?
So I don't quite get your zephir example, It might help, if you could add an example, that compiles, which shows the slow-down in comparison to PHP, where I might be able to help you more. TLDR: Changing that behavior isn't feasible afaik, bad design decisions are hard to get rid of, probably won't happen soon, if somebody doesn't have ideas or a better implementation |
Legacy is always a pain. As far as I understand it a recursive grammar should make that easy to handle. Off the top of my head (huge simplification/pseudo): Perhaps it is related to PHP having less support for dereferencing previously (being limited previously) or because of the difficulty of working with that parser. When I look at the parsing it gives me a headache. I'm not sure at this stage what generates what. I tested it and it is definitely copying, however I never checked the code. It is the PH_SEPARATE flag that is doing it on update long. I could make it dereference the array manually then edit the code so that it treats that variable as I want it, just a basic temporary zval pointer. It is a shame it can't work straight off. |
@joeyhub You might want to take a look at this: zephir-lang/zephir#609 |
I will take a look and give some comments or suggestions if I find the time. I think this issue needs more work in the future particularly in the design. I don't think it is enough to just add support for pass by reference on top. It looks like zephir implements it's own crude COW which doesn't always work. I don't recommend implementing COW on top of things and hooking every array function. It seems to make things even more complex, for example if you use array_shift it wont COW because it hasn't hooked that function, unless there is something I am missing you'll have to coerce the array into a copy first (let a[0] = a[0];). It is quite unpredictable there. In my first test the zephir code is up to 2 times slower than PHP even though it is doing a huge amount of data processing which means there is little overhead crossing the PHP/C boundary. I'm seeing a possible hint of a small memory leak as well. On the other hand I have at least found a way to turn a huge PHP algorithm into C code that I can manage rather than having to write it from scratch. It wont be hard to get that to go from 2 times slower to > 10 times faster at all. The stability of the code produced is a bigger concern. I get a segfault if I have:
At least here we can give the heads up that using php functions that change arrays by reference will change your passed array by reference even if passed indirectly through parameters. |
The good news is that when I simply remove PH_SEPARATE it appears to behave identically to the PHP equivalent. I can't see any memory leaks so the memory management handles it. It is twice as fast as PHP like this and uses half as much memory. It should be functionally the same so I am a bit surprised about the lower memory usage. Although zephir is adding a lot of overhead it must be putting some of it to good use. Now it already compares to the expected performance improvement from using hhvm or php7 (possibly javascript as well which can be up to 10 times faster but I've seen it only 2 times faster for many tasks like this) and I haven't optimised it in anyway I hadn't optimised the original PHP where I had already coerced it not to COW for that one case. I have made it do everything exactly as my PHP version does it. -O2 might have also helped. If I carefully switch it so that it doesn't separate everywhere unless necessary I anticipate it should be possible to reach the well over > 10 times performance mark which I am aiming for. This is something I can't do in PHP without using anonymous objects all the time which is undesirable. Raw PHP-C could be around 100 times faster in the first version but would take 50 times longer to produce with ten times as many bugs that would be very hard to detect or fix. It takes a bit of effort but zephir pulls through. Two days is still much better than two weeks. I think it would be worth considering to pass arrays by reference the same as Javascript. This demonstrably yields massive performance gains in scripts that need to work heavily at mutating arrays. Although not always concise this gives most of what you would want from the pass by reference feature:
Perhaps before too many people adopt the language. |
I have a few issues with the syntax parsing in this case. It uses it's own parser so I would expect it to be better. Expressions should just work.
I have a weird little bit of code like this:
This fails to parse on several levels but should parse.
A secondary issue is whether or not assigning the array at that property to a variable will create a cow array or array by reference. When you need to create large structures COW arrays are extremely inefficient. I would hope the framework would provide arrays by reference by default and then a copy method for when a copy is needed. I have tested and it is COW, which is a major drawback. There is no benefit to it being COW in PHP, they just made it that way in the beginning. Importing this quirk from PHP is a major design flaw in zephir that defeats its purpose in many cases. I have worked with passing PHP arrays by pointer/reference in C extensions by hand and there is no problem with it. zvals are pointers already so it's really easy:
I don't think it is a good idea to make it too safe or managed like PHP CPP because we tested PHP CPP and half the time it gave no gains or so little gains that PHP code optimisation or using another scripting language would have done the trick. Because I can't do what I am doing in PHP in zephir my PHP code is more memory efficient by a factor of 10. zephir is useless to me until this is fixed. The point of passing around arrays in an object collection is to get around the limitations of PHP so without supporting dereferencing there PHP is faster than zephir.
At the very least I should be able to tell it that I want a pointer to the array in such and such a location rather than a copy.
var also does not support complex expressions. You need...
var appears to only allow literal/defalt assignment.
I cannot create a dynamic class. In PHP you can do new $classname();. I would expect new (classname)() to work but it does not.
In a switch statement I cannot declare a local variable.
Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.
The text was updated successfully, but these errors were encountered: