Sunday 15 March 2015

Project Progress

Project Progress: PERL5
This post is about a project I am starting in my SPO600 class that requires me to optimize a portion of the LAMP stack.

Areas to optimize

I have chosen Perl 5 as the package I will be working on, and I have found a few areas that may be good candidates for optimization.
I found the first two areas using the Perl todo list.
The first was tail call optimization, seen at line 1130 of the todo list.
This would essentially have me find places where tail call optimization is possible and rewrite them to use it. Here is a link that explains what tail call optimization is: TCO
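To make the idea concrete, here is a minimal sketch in C of what a tail call is and what optimizing it amounts to. It is only an illustration of the general technique: the factorial functions and their names are made up for this post and are not taken from the Perl source or from the todo item.

```c
#include <stdint.h>

/* Not a tail call: the multiplication happens after the recursive call
 * returns, so every level of recursion needs its own stack frame. */
uint64_t factorial_plain(unsigned n)
{
    if (n <= 1)
        return 1;
    return n * factorial_plain(n - 1);
}

/* Tail call: the recursive call is the very last thing the function does.
 * The running product is carried along in acc, so nothing is left to do
 * after the call returns. */
uint64_t factorial_acc(unsigned n, uint64_t acc)
{
    if (n <= 1)
        return acc;
    return factorial_acc(n - 1, acc * n);
}

uint64_t factorial_tail(unsigned n)
{
    return factorial_acc(n, 1);
}

/* What the optimization amounts to: the tail call is replaced by a jump
 * back to the top of the function, which is simply a loop. */
uint64_t factorial_loop(unsigned n)
{
    uint64_t acc = 1;
    while (n > 1) {
        acc *= n;
        n--;
    }
    return acc;
}
```

All three functions compute the same value. The difference is that the loop version runs in constant stack space, which is what a compiler or interpreter can achieve automatically once a call is in tail position.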

The second area I found concerns Perl's regular expression engine, seen at line 1154 of the todo list.
In their engine, certain regular expressions end up taking exponential time. They have a workaround for this called the super-linear cache, but they say the code has not been well maintained and could use improvement. I found the location of this code in the source by grepping for the keyword 'super-linear', which turned up in regexec.c.
This seems like it could be an area for optimization, although I am not very confident about how I would attempt it or what I would change, because I do not have a strong knowledge of how a regular expression engine works.
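As a rough illustration of why a backtracking matcher can take exponential time and how a cache of failed states helps, here is a toy sketch in C. It is not Perl's implementation and is not based on regexec.c: the matcher is hard-coded for the single pattern ^(a|aa)*$, and the names match_from, run_naive, run_cached and match_result are assumptions made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Result of one match attempt plus how many matcher calls it took. */
typedef struct {
    bool matched;
    unsigned long steps;
} match_result;

/* Toy backtracking matcher for the fixed pattern ^(a|aa)*$.
 * failed may be NULL (no cache). Otherwise failed[pos] != 0 means
 * "matching from pos is already known to fail". */
static bool match_from(const char *s, size_t pos, size_t len,
                       unsigned char *failed, unsigned long *steps)
{
    (*steps)++;
    if (pos == len)
        return true;                /* consumed the whole input */
    if (failed && failed[pos])
        return false;               /* cache hit: skip the repeated work */

    /* Alternative 1: consume a single 'a'. */
    if (s[pos] == 'a' && match_from(s, pos + 1, len, failed, steps))
        return true;
    /* Alternative 2: consume "aa". */
    if (pos + 1 < len && s[pos] == 'a' && s[pos + 1] == 'a'
        && match_from(s, pos + 2, len, failed, steps))
        return true;

    if (failed)
        failed[pos] = 1;            /* remember the dead end */
    return false;
}

/* Plain backtracking: the same positions are retried over and over. */
match_result run_naive(const char *s)
{
    match_result r = { false, 0 };
    r.matched = match_from(s, 0, strlen(s), NULL, &r.steps);
    return r;
}

/* Backtracking with a cache of positions that already failed.
 * If calloc fails, failed is NULL and this degrades to the naive run. */
match_result run_cached(const char *s)
{
    match_result r = { false, 0 };
    size_t len = strlen(s);
    unsigned char *failed = calloc(len + 1, 1);
    r.matched = match_from(s, 0, len, failed, &r.steps);
    free(failed);
    return r;
}
```

On an input such as a run of 'a' characters followed by a single 'b', neither version matches, but the naive version's step count grows exponentially with the length of the run, while the cached version's stays proportional to the input length. The price is the extra memory for the failed array, one byte per input position in this sketch.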

I found the final area by looking through the Perl 5 git repository (instructions here). I found some sections of inline assembly by grepping for the keyword 'asm' using 'grep -r asm ./*'. These sections were in a file called os2.c:
my_emx_init()
my_os_version()
These functions could potentially be ported to aarch64 assembly. I am a bit uncertain about this area because I am not sure what this code does or whether it is important.
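For reference, this is the general shape that architecture-specific inline assembly tends to take in portable C. It is only a sketch under stated assumptions: it does not reproduce anything from os2.c, the function name read_tick_counter is made up for this post, and it assumes a GCC-compatible compiler for the asm branches. It reads a hardware tick counter on aarch64 and x86 and falls back to plain C elsewhere.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper, not from the Perl source: returns a tick count
 * using whichever mechanism the target architecture offers. */
uint64_t read_tick_counter(void)
{
#if defined(__GNUC__) && defined(__aarch64__)
    /* aarch64: read the virtual counter register. */
    uint64_t ticks;
    __asm__ volatile("mrs %0, cntvct_el0" : "=r"(ticks));
    return ticks;
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* x86: rdtsc returns the time stamp counter in EDX:EAX. */
    uint32_t lo, hi;
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    /* Any other platform or compiler: portable fallback, no assembly. */
    return (uint64_t)clock();
#endif
}
```

The point of the pattern is that each architecture gets its own guarded branch, so adding aarch64 support means adding a branch rather than rewriting the existing x86 one. The counters tick at different rates on each platform, so the values are only comparable on the same machine.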

Why Perl?

I chose Perl for my project because the community seems really clear and organized. They have a todo list with various tasks, which is very useful, and as you can see above it helped me a lot in finding areas to work on. They also have a very active community. There are daily messages in their mailing list archive, which makes me confident that if I need help or have a question I won't be waiting for long.

Proceeding

Looking at my three options, I believe the tail call optimizations might have a large impact, depending on how many places I can find to apply them. I would like to implement some code for the aarch64 platform because that would relate most closely to the SPO600 course, but I am uncertain about the inline assembly code that I have found so far. The regular expression area seems really interesting, but I am afraid it would not be feasible in the time I have. It is something I will definitely consider if my project doesn't go as planned.
As I proceed with this project, I plan to start working out how to apply the tail call optimizations while I engage with the upstream community about which direction is best for them and for me. I also plan to benchmark Perl on x86 and aarch64 to see if I can find any further areas or functions that would allow a platform-specific optimization.

Perl Upstream

Perl has a relatively straightforward guide on their website here.
To summarize, if you have a patch, either use perlbug or send it to perlbug@perl.org. Once the patch has been processed it will be posted on the mailing list for discussion, and you are encouraged to join the discussion and promote your patch. They recommend using git. You can get the source by using 'git clone git://perl5.git.perl.org/perl.git perl'. Once you have made changes, you can use git diff to make a patch, which compares your changes against the main branch.

Conclusions

This project has made me more nervous than any project I have had so far. It is filled with uncertainties. A couple of weeks ago I was uncertain I would even find anything to work on, but eventually I did. Now I am uncertain about which direction to go and whether my contributions will be accepted. Regardless of what happens, it is a great learning experience, and I now appreciate the complexity of large projects like Perl and the other packages in the LAMP stack.
