Python - Doesn't do real for loops

24. February 2012 07:00

 

Previously I wrote about Python and its weakness with 2D arrays. This has got me thinking a little more outside the box.

 

I have also discovered that Python doesn't actually support the traditional for loop. It supports a for-each loop instead, which has issues when you attempt to run quite large loops. This is a simple example of one of those issues, which you normally only discover when it is too late, resulting in a crash of some kind.

 

To start with, let's look at a simple loop.

 

>>> for i in range(0, 10):
...  print i
...
0
[cut]
9
>>>

 

 

It does of course print 0 to 9 as output. But look at how it is constructed. It uses the range function, and in Python 2 (used here) range doesn't produce a lazy number range. It builds a complete list of numbers from the parameters passed to it. So what's the big deal?

 

Well, if you need to run a "big" for loop in Python you simply cannot use it... well, not without using lots and lots of memory. Not to mention that the entire list has to be created before the first iteration of the loop runs, which is going to make it dog slow to begin with.
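
As a rough illustration of the memory cost, the sketch below measures the list object that holds the loop indices. The helper name list_overhead_bytes is made up for this example, and sys.getsizeof only counts the list's own pointer array, not the integer objects it points to, so the real footprint is larger than the figures printed.

```python
import sys

def list_overhead_bytes(n):
    """Size in bytes of the list object that holds n loop indices.

    Only the list's own pointer array is counted, not the integer
    objects it refers to, so this is a lower bound.
    """
    # On Python 2, range(n) alone already builds this list.
    indices = list(range(n))
    return sys.getsizeof(indices)

small = list_overhead_bytes(10)
large = list_overhead_bytes(1000000)
print("10 indices: %d bytes" % small)
print("1,000,000 indices: %d bytes" % large)
```

The exact numbers depend on the interpreter build, but the list grows linearly with the loop count before the loop body has run even once.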

 

So to get Python to use around 170MB of memory (in a for loop), simply write:

 

>>> for i in range(0, 10000000):
...     print i

 

 

But wait... 10 million? That's a very big number. Or is it? It is not uncommon for large data sets to grow to 10 million records. This is exactly the sort of place you might want to use Python as well.
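
For loops of this size it is worth knowing that Python 2 also ships a built-in called xrange, which hands out the numbers one at a time instead of building the whole list first. The sketch below is not from the original post. The name lazy_range is made up here so that the same code also runs on Python 3, where range is already lazy and xrange no longer exists.

```python
try:
    lazy_range = xrange  # Python 2: lazy counterpart of range
except NameError:
    lazy_range = range   # Python 3: range is already lazy

# Same 10 million iterations as above, without a 10 million element list.
total = 0
for i in lazy_range(0, 10000000):
    total += i
print(total)
```

The loop still takes time to run, but memory use stays flat because only one index exists at any moment.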

 

So I guess the alternative is to roll your own for loop using a while loop, which I have had to do for various other things, such as variable stride lengths across arrays. Though it kind of defeats the entire purpose of there being a for loop in the first place.
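
A minimal sketch of what rolling your own loop might look like. The generator name count_from and the doubling stride are made up for illustration and are not code from the original post.

```python
def count_from(start, stop, step=1):
    """Yield start, start + step, ... while the value is below stop."""
    if step <= 0:
        raise ValueError("step must be positive in this sketch")
    i = start
    while i < stop:
        yield i
        i += step

# Fixed stride: behaves like a C-style for (i = 0; i < 10; i += 3).
for i in count_from(0, 10, 3):
    print(i)

# Variable stride: a plain while loop where the step doubles each time.
visited = []
i = 0
stride = 1
while i < 20:
    visited.append(i)
    i += stride
    stride *= 2
print(visited)
```

Neither version builds a list of indices up front, so memory use does not depend on how far the loop counts.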

 

As a bonus surprise while playing with this feature, it probably also opens up a whole new range of security holes if a loop counter is supplied from end-user input. This would lead to using massive amounts of memory. The real surprise was being greeted with the following from the Linux kernel, whose out-of-memory killer could just as easily have picked Apache or a database engine to kill instead.

 

 

[5537286.542945] Out of memory: kill process 14166 (python) score 230114 or a child
[5537286.543168] Killed process 14166 (python)
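
One way to guard against the user-supplied loop counter problem described above is to validate the number before it ever reaches range. This is only a sketch: the function parse_loop_count and the MAX_ITERATIONS cap are hypothetical names, and the right limit depends entirely on the application.

```python
MAX_ITERATIONS = 100000  # hypothetical cap; pick a value that suits the application

def parse_loop_count(raw, limit=MAX_ITERATIONS):
    """Convert user-supplied text to a loop count, rejecting unsafe values."""
    count = int(raw)  # raises ValueError on non-numeric input
    if count < 0 or count > limit:
        raise ValueError("loop count %d is outside 0..%d" % (count, limit))
    return count

# Accepted: a small count goes straight into the loop.
for i in range(parse_loop_count("5")):
    print(i)

# Rejected: a count that would build a huge list is refused up front.
try:
    parse_loop_count("10000000")
except ValueError as err:
    print("refused: %s" % err)
```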

 

 

But I guess that's Python's way of doing things!



Comments (3) -

3/9/2012 9:56:53 AM #

Try running this with Python 3.2 (maybe also earlier 3.x). This should handle it without the large RAM footprint.

Konrad, Germany

3/9/2012 4:48:04 PM #


However, since the syntax of many things in Python 3 is not backwards compatible, this can be a problem in many situations!

james, United Kingdom

3/11/2012 2:52:54 AM #

wiki.python.org/.../PerformanceTips

gv, Italy
