Use Psyco, and Python will run as fast as C

Using Psyco, the Python specializing compiler


In some respects, Python's design resembles Java's. Both use a virtual machine that interprets pseudo-compiled bytecode. One area where Java virtual machines have outpaced Python is the optimization of bytecode execution. Psyco, a specializing compiler for Python, helps even the score. For now Psyco is an external module, but someday it could be included in Python itself. With only a tiny amount of extra programming, Psyco can often speed up Python code by an order of magnitude. In this article, David looks at what Psyco is and tests it on several applications.


Python is usually fast enough for what you want it to do. Ninety percent of the questions that novice programmers raise about the execution speed of interpreted, byte-compiled languages such as Python are simply naive. On modern hardware, most unoptimized Python programs run as fast as they need to, and there is really no point in spending extra programming effort to make an application run faster.


In this article, therefore, I touch only on the remaining ten percent. Once in a while, Python programs (or programs in other languages) run unacceptably slowly. The problems being solved vary widely; a gain of milliseconds rarely matters, but speeding up tasks that run for minutes, hours, days, or weeks is often worth the effort. Moreover, it should be noted that not everything that runs slowly is CPU bound. If, for example, a database query takes hours to complete, it matters little whether the resulting data set is processed in one minute or two. This article is also not about I/O problems.


There are several ways to speed up a Python program. The first that should come to any programmer's mind is to improve the algorithms or data structures used. Micro-optimizing the steps of an inefficient algorithm is a fool's errand. For example, if the order of complexity of the current technique is O(n**2), speeding up the steps tenfold is much less helpful than finding an O(n) replacement. This holds even when considering an approach as extreme as rewriting in assembly: the right algorithm in Python will often run faster than the wrong algorithm hand-tuned in assembly.
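
As a minimal sketch of that point, compare a quadratic and a linear way of detecting duplicates in a list. The task and the function names are illustrative assumptions, not taken from any program discussed in this article, and the linear version assumes the items are hashable.

```python
def has_duplicates_quadratic(items):
    """O(n**2): compare every pair of items."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """O(n) on average: remember the items already seen in a set."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

No amount of speeding up the inner comparison in the first function changes how its cost grows with the length of the list; the second function does less work as soon as the list gets large.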


The second technique, and one you should consider early, is profiling your Python application with an eye toward rewriting key portions as C extension modules. Using an extension wrapper such as SWIG (see Resources), you can create a C extension that runs, as C code, the elements of your program that consume the most time. Extending Python this way is fairly straightforward, but it takes some time to learn (and requires knowledge of C). You will find that very often the lion's share of your Python application's running time is spent in a handful of functions, so a significant gain is possible.
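
Finding that handful of functions might look like the following sketch, which assumes a current Python with the standard cProfile and pstats modules. The functions slow_sum_of_squares, main, and profile_report are hypothetical stand-ins for a real application.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately loop-heavy stand-in for an application hot spot."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    """Hypothetical entry point that spends most of its time in one function."""
    return [slow_sum_of_squares(20000) for _ in range(20)]


def profile_report(func, limit=5):
    """Run func under the profiler and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


report = profile_report(main)
```

Printing report lists the most expensive calls first; functions near the top are the candidates worth rewriting as an extension.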


The third technique builds on the second. Greg Ewing has created a language called Pyrex that bridges Python and C. To use Pyrex, you write functions in a Python-like language that adds type declarations to selected variables. Pyrex (the tool) transforms ".pyx" files into ".c" extensions. Once compiled with a C compiler, these Pyrex (the language) modules can be imported and used in your regular Python applications. Because Pyrex uses almost the same syntax as Python itself (including loop, branch, and exception statements, assignment forms, block indentation, and so on), a Pyrex programmer does not need to learn C to write extensions. Moreover, compared with an extension written directly in C, Pyrex allows a more seamless mixing of C-level variables and Python-level variables within the same code.


The subject of this article is yet another technique. The extension module Psyco can plug into the guts of the Python interpreter and selectively replace portions of Python's interpreted bytecode with optimized machine code. Unlike the techniques described above, Psyco operates strictly at Python run time. In other words, Python source code is compiled to bytecode by the python command in exactly the same way as before (apart from a couple of import statements and function calls added to start Psyco). But while the Python interpreter is running the application, Psyco sometimes checks whether it can substitute specialized machine code for ordinary Python bytecode operations. This specialized compilation is very similar to what Java just-in-time compilers do (broadly speaking, at least), and it is also architecture-specific. At present, Psyco is available only for i386 CPU architectures. The beauty of Psyco is that you can use the very same Python code you have been writing all along (literally!) and simply have it run faster.

How Psyco works


To understand Psyco completely, you probably need a good grasp of both the Python interpreter's eval_frame() function and i386 assembly. Unfortunately, I cannot claim to be an expert in either, but I think I can explain Psyco in general terms without going too far wrong.


In regular Python, the eval_frame() function is the inner loop of the Python interpreter. Basically, eval_frame() looks at the current bytecode in an execution context and switches control to a function suited to implementing that bytecode. What that support function actually does depends, in general, on the state of the various Python objects held in memory. To put it simply, adding the Python objects "2" and "3" produces a different result from adding the objects "5" and "6", yet both operations are dispatched in the same way.
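
The dispatch idea can be sketched with a toy stack machine. This is only an illustration: the opcode names, the handler functions, and toy_eval_frame are invented here and bear no relation to CPython's real opcodes or to the actual C implementation of eval_frame().

```python
def op_push(stack, arg):
    """Push a constant onto the stack."""
    stack.append(arg)


def op_add(stack, arg):
    """Pop two values and push their sum; arg is unused."""
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right)


# One handler per opcode: the loop below only looks up and calls them.
DISPATCH = {"PUSH": op_push, "ADD": op_add}


def toy_eval_frame(code):
    """Run a list of (opcode, argument) pairs and return the top of the stack."""
    stack = []
    for opcode, arg in code:
        DISPATCH[opcode](stack, arg)
    return stack[-1]


add_2_3 = [("PUSH", 2), ("PUSH", 3), ("ADD", None)]
add_5_6 = [("PUSH", 5), ("PUSH", 6), ("ADD", None)]
```

Both programs travel through exactly the same handlers; only the objects on the stack differ, which is why the results differ.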


Psyco replaces the eval_frame() function with a compound evaluation unit. There are several ways in which Psyco can improve on what Python does. First, Psyco compiles operations to somewhat optimized machine code; in itself this yields only minor improvements, since what the machine code has to accomplish is the same as what Python's dispatched functions do. Moreover, what is "specialized" in Psyco compilation is more than the choice of Python bytecodes: Psyco also specializes over the variable values that are known in an execution context. For example, in code like the following, the variable x is knowable for the duration of the loop:



x = 5
l = []
for i in range(1000):
    l.append(x*i)


An optimized version of this code does not need to multiply each i by "the content of the x variable/object"; it is less costly simply to multiply each i by 5, which saves a lookup and dereference.
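
Written out by hand in plain Python, the difference looks like the sketch below. This is an analogy only: Psyco performs the substitution at the machine-code level, and the function names here are invented for illustration.

```python
def scaled_generic(x, n):
    """Generic form: the value of x is looked up on every pass."""
    result = []
    for i in range(n):
        result.append(x * i)
    return result


def scaled_by_5(n):
    """Hand-specialized form: the known value 5 is baked into the loop."""
    result = []
    for i in range(n):
        result.append(5 * i)
    return result
```

Both functions produce the same list when x is 5; the second simply has one fewer thing to find out on each iteration.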


Besides creating i386-specific code for small operations, Psyco caches the compiled machine code for later reuse. If Psyco can recognize that a particular operation is the same as one performed (and "specialized") earlier, it can rely on the cached code instead of compiling the segment again. This saves a bit more time.
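
A rough plain-Python analogy for this caching, assuming closures stand in for compiled machine code: a specialized function is built the first time a value is seen and reused afterwards. The names make_scaler, get_scaler, and the cache dictionary are hypothetical and have nothing to do with Psyco's internals.

```python
_specialized_cache = {}


def make_scaler(x):
    """Build a function specialized for one known value of x."""
    def scaler(n):
        return [x * i for i in range(n)]
    return scaler


def get_scaler(x):
    """Return the cached specialization for x, building it on first request."""
    if x not in _specialized_cache:
        _specialized_cache[x] = make_scaler(x)
    return _specialized_cache[x]
```

Calling get_scaler(5) twice returns the very same function object, so the cost of building it is paid only once.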


The real savings in Psyco, however, come from its categorization of operations into three levels. For Psyco, there are "run-time," "compile-time," and "virtual-time" variables. Psyco moves variables from one level to another as needed. Run-time variables are simply the raw bytecode and object structures that the regular Python interpreter handles. Compile-time variables are represented in machine registers and directly accessed memory locations once Psyco has compiled the operations into machine code.


The most interesting level is virtual-time variables. Internally, a Python variable is a complete structure with many members, even when the object represents nothing more than an integer. Psyco's virtual-time variables represent Python objects that could be built if the need arose, but whose details are left out until then. For example, consider the following assignment:



x = 15 * (14 + (13 - (12 / 11)))


Standard Python builds and destroys a number of objects to compute this value. A full integer object is created to hold the value of (12/11); then the value is pulled out of that temporary object's structure and used to compute a new temporary object (13-PyInt). Psyco skips these objects and simply computes the values, knowing that an object can be created from the value "if needed."



Using Psyco


Using Psyco is far easier than explaining it. Basically, all you need to do is tell Psyco which functions and methods to "specialize." No code needs to change in any of your Python functions or classes.


There are a couple of approaches to telling Psyco what to do. The "shotgun" approach is to let Psyco perform just-in-time compilation everywhere. To do that, put the following lines at the top of your module:



import psyco; psyco.jit()
from psyco.classes import *


The first line tells Psyco to work its magic on all global functions. The second line (for Python 2.2 and above) tells Psyco to do the same with class methods. To direct Psyco's behavior a bit more precisely, you can use the commands:



psyco.bind(somefunc)          # or method, class
newname = psyco.proxy(func)


The second form leaves func as a standard Python function, but optimizes calls that go through newname. In almost all cases apart from testing and debugging, the psyco.bind() form is the one you will use.
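
A slightly fuller sketch of both forms follows. It uses only psyco.bind() and psyco.proxy() as shown above, and it assumes Psyco may not be installed at all (it targets i386 only), so the import is guarded and the code falls back to the plain functions. The functions dot and mean are illustrative, not part of Psyco.

```python
try:
    import psyco  # only importable where Psyco is installed
except ImportError:
    psyco = None


def dot(xs, ys):
    """Loop-heavy float arithmetic, the kind of code Psyco targets."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def mean(values):
    """Another small numeric loop."""
    total = 0.0
    for value in values:
        total += value
    return total / len(values)


if psyco is not None:
    psyco.bind(dot)                # dot itself is now specialized
    fast_mean = psyco.proxy(mean)  # mean stays plain; fast_mean is specialized
else:
    fast_mean = mean               # fallback: same behavior, no speedup
```

Guarding the import this way keeps the module usable on machines without Psyco, at the cost of silently running at normal speed there.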



Psyco performance


For all of Psyco's magic, using it still requires some thought and testing. The main thing to understand is that Psyco is useful for loops that execute many times, and that it knows how to optimize operations involving integers and floating point numbers. For non-looping functions, and for operations on other types of objects, Psyco usually just adds overhead for its analysis and internal compilation. Moreover, for applications with large numbers of functions and classes, enabling Psyco for the whole application adds a burden in machine-code compilation and in the memory used for caching. It is far better to bind selectively those functions that can benefit most from Psyco's optimizations.


I began with the most naive kind of test. I simply asked myself which of the applications I had run recently would be nice to speed up. The first example that came to mind was a text-manipulation program that I use to convert drafts of my forthcoming book, Text Processing in Python, to LaTeX format. This application uses some string methods, some regular expressions, and some program logic driven mostly by regular expression and string matches. It is actually a terrible candidate for Psyco, but I use it, so I started there.


On the first pass, all I did was add psyco.jit() to the top of my script. Painless enough. Unfortunately, the results were (predictably) disappointing. Where the script originally took 8.5 seconds to run, after Psyco's "speedup" it took 12 seconds. Not good! I guessed that just-in-time compilation probably has some startup overhead that swamped the running time. So the next thing I tried was processing a much larger input file (made up of multiple copies of the original one). This led to an extremely modest success: running time dropped from 120 seconds to 110. The improvement was consistent across several runs, but fairly insignificant either way.
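
Measurements of this kind can be collected with a small harness such as the sketch below, which assumes a current Python with time.perf_counter. The helper best_time and the sample workload count_words are hypothetical; the idea is to time the same function several times with and without a Psyco binding and compare.

```python
import time


def best_time(func, args=(), repeats=3):
    """Call func(*args) several times; return the smallest elapsed seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)


def count_words(text):
    """Trivial text-processing workload used only to exercise the harness."""
    return len(text.split())


sample = "spam and eggs " * 1000
elapsed = best_time(count_words, (sample,))
```

Repeating the run and keeping the best time reduces noise from other activity on the machine, which matters when the differences are as small as the ones reported here.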


Next came a second pass with my text-processing candidate. Instead of adding a bare psyco.jit() call, I added only the line psyco.bind(main), since the main() function does loop a number of times (but makes only minimal use of integer arithmetic). The results here were nominally better. This approach shaved a few tenths of a second off the running time for the first example, and a few seconds off for the large-input version. Still nothing spectacular (but no harm done either).


For a more worthwhile test of Psyco, I dug up some neural network code that I wrote about in an earlier article (see Resources). This "code_recognizer" application can be trained to recognize the likely distribution of different ASCII values in various programming languages. Something like this could potentially be useful for guessing file types (say, of lost network packets); but the code is actually completely generic as to what it is trained on, and it could just as easily learn to recognize faces, sounds, or tidal patterns. In any case, "code_recognizer" is based on the Python library bpnn, which is also included (in modified form) as a test case in the Psyco 0.4 distribution. The important things to know about "code_recognizer" for this article are that it does a lot of looping over floating point numbers and that it takes a long time to run. Here, then, is a good candidate for Psyco.


After experimenting a bit, I worked out the details of how to use Psyco. For this application, with its small number of classes and functions, it makes little difference whether you use targeted binding or just-in-time compilation. The best result, by a few percent, still comes from selectively binding the classes that optimize best. More important, however, is to understand the scope of Psyco binding.


The code_recognizer.py script contains lines like:



from bpnn import NN

class NN2(NN):
    # customized output methods, math core inherited
    pass  # method definitions elided


In other words, the part that is interesting from Psyco's point of view lives in the class bpnn.NN. Adding psyco.jit() or psyco.bind(NN2) to the code_recognizer.py script accomplishes little. To get Psyco to perform the desired optimization, you need to add either psyco.bind(NN) to code_recognizer.py or psyco.jit() to bpnn.py. Contrary to what you might assume, just-in-time compilation does not take effect when an instance is created or when methods are called, but where the class is defined. In addition, binding a derived class does not specialize its inherited methods.
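
The scoping issue is easier to see by checking where each method is actually defined. The sketch below does not use Psyco at all; the two classes are simplified stand-ins for bpnn.NN and the script's subclass, and the method names update and report, like the helper defined_in, are hypothetical.

```python
class NN:
    """Stand-in for bpnn.NN: holds the loop-heavy math core."""

    def update(self, inputs):
        return [value * 0.5 for value in inputs]


class NN2(NN):
    """Stand-in for the script's subclass: only output methods are new."""

    def report(self, inputs):
        return "outputs: %s" % self.update(inputs)


def defined_in(cls):
    """Names of the methods defined directly in cls, not inherited."""
    return sorted(
        name
        for name, value in vars(cls).items()
        if callable(value) and not name.startswith("__")
    )
```

Here defined_in(NN2) contains only report, while update appears only under NN. Since binding a derived class does not specialize inherited methods, the class that actually defines the numeric loop is the one worth binding.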


Once the details of proper Psyco binding were worked out, the resulting speedup was quite impressive. Using the same test cases and training regime presented in the article mentioned above (500 training patterns, 1000 training iterations), neural network training time fell from about 2000 seconds to about 600 seconds, better than a threefold speedup. Reducing the number of iterations to as few as 10 showed a proportional speedup (but an utterly worthless recognizer), as did intermediate numbers of iterations.


I find it rather remarkable that two lines of code can bring a program that ran for about half an hour down to about 10 minutes. This speedup probably still falls short of the speed of a similar application written in C, and it is definitely less than the tenfold speedup reported in a few isolated Psyco test cases. But this application is clearly from "real life," and the improvements are large enough to matter in many contexts.



Where is Psyco going?


At the moment, Psyco gathers no statistics and does no profiling, and it performs only minimal optimization of the generated machine code. Possibly a later version will know how to identify the operations that would benefit most from optimization, and to discard cached machine code for operations that cannot be optimized. In addition, a future Psyco might decide to perform more extensive (but also more costly) optimization of heavily executed operations. Such run-time analysis would be similar to what Sun's HotSpot technology does for Java. The fact that Java, unlike Python, has type declarations is actually less significant than many people think (as earlier work on optimizing Self, Smalltalk, Lisp, and Scheme also shows).


Although I doubt it will ever happen, it would be great if Psyco-type technology were integrated into some future version of Python itself. A few lines for imports and bindings are not much work, but having Python simply run substantially faster on its own would be far more seamless. Time will tell.