Python - Doesn't do real for loops

24. February 2012 07:00

 

Previously I wrote about Python and its weakness with 2D arrays. That has now got me thinking a little more outside the box.

 

I have also discovered that Python doesn't actually support the traditional for loop. It supports a for-each loop instead, which has issues when you attempt to run quite large loops. This is a simple example of one of these issues, the kind you normally only discover when it is too late, resulting in a crash of some kind.

 

To start with, let's look at a simple loop.

 

>>> for i in range(0, 10):
...  print i
...
0
[cut]
9
>>>

 

 

It does of course print 0 - 9 as output. But look at how it is constructed. It uses the range function, and range (in the Python 2 interpreter used here) doesn't produce a number range. It produces a list of numbers from the parameters passed to the function. So what's the big deal?

 

Well, if you need to run a "big" for loop in Python you simply cannot use range() for it... Well, not without using lots and lots of memory. Not to mention that the entire list has to be created before the first iteration of the loop runs, which is just going to make it dog slow to begin with.

 

So to get Python to use around 170MB of memory (in a for loop), simply write:

 

>>> for i in range(0, 10000000):
...     print i
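
For comparison, here is a minimal sketch of the same 10 million iteration count without building the list first. It assumes the Python 2 builtin xrange, which hands out one number at a time, and falls back to range on Python 3, where range is already lazy. The names lazy_range and total are just for illustration, and the loop sums the counter instead of printing it to keep the output short.

```python
# Pick a counter that does not build the whole list up front.
try:
    lazy_range = xrange  # Python 2: numbers are produced one at a time
except NameError:
    lazy_range = range   # Python 3: range is already lazy

total = 0
for i in lazy_range(0, 10000000):
    total += i  # stand-in for real work on each record

print(total)
```

Memory use here should stay roughly flat regardless of the loop size, since only the current counter value exists at any moment.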

 

 

But wait... 10 million? That's a very big number. Or is it? It is not uncommon for large data sets to grow to 10 million records. This would be exactly the sort of place you might want to use Python as well.

 

So I guess the alternative to this is to roll your own for loop using a while loop, which I have had to do for various other things, such as variable stride lengths across arrays. Though it kind of defeats the entire purpose of there being a for loop in the first place.
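
A rough sketch of what a hand-rolled loop with a variable stride can look like. The data, the picked list and the rule that the stride grows by one on every step are all made up for illustration; the point is only that the counter is a plain integer, so no list of indices is ever built.

```python
data = list(range(20))  # hypothetical data set

i = 0        # loop counter, managed by hand
stride = 1   # current step size
picked = []

while i < len(data):
    picked.append(data[i])
    stride += 1   # hypothetical rule: the stride grows by one each step
    i += stride

print(picked)  # [0, 2, 5, 9, 14]
```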

 

As a bonus surprise while playing with this feature: it probably also opens up a whole new range of security holes if a loop counter is supplied from end user input, since that could lead to massive amounts of memory being used. The real surprise was being greeted with the following from the Linux kernel, whose out-of-memory killer could just as well have picked Apache or a database engine to kill in its place.

 

 

[5537286.542945] Out of memory: kill process 14166 (python) score 230114 or a child
[5537286.543168] Killed process 14166 (python)
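
One way to reduce that risk is to check any user-supplied loop count against an upper bound before looping. This is only a sketch: the function name parse_loop_count and the MAX_ITERATIONS cap are hypothetical, and the right limit depends entirely on the application.

```python
MAX_ITERATIONS = 1000000  # hypothetical cap, pick one that suits the application


def parse_loop_count(raw, limit=MAX_ITERATIONS):
    """Convert user input to a loop count, rejecting values outside 0..limit."""
    count = int(raw)  # raises ValueError for non-numeric input
    if count < 0 or count > limit:
        raise ValueError("loop count out of range: %d" % count)
    return count


print(parse_loop_count("10"))

try:
    parse_loop_count("10000000000")
except ValueError as err:
    print("rejected: %s" % err)
```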

 

 

But I guess that's Python's way of doing things!



Python - 2D Arrays don't work.

22. February 2012 20:25

 

If you have been working with Python you will have noticed that 2D arrays just don't work. This might come as a surprise, since almost every other modern programming language supports 2D arrays. What might come as an even bigger surprise is that Python doesn't support arrays at all in the language itself. It uses lists to provide this functionality. It might look like an array, but it is actually a list.

 

This can be shown with the following. Note the error about the list.

 

 

arr = []
arr[0] = 1


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list assignment index out of range

 

 

So once you stop thinking about trying to create an array in Python (which is easy, since they don't exist) and start thinking about creating a list of items, it becomes much easier to understand why your 2D array of arr = [][] just doesn't work. Python sees an empty list followed by an empty subscript, and rejects that as a syntax error.

 

Creating a single-dimension list is easy, and we can do so using the following code.

 

>>> arr = range(0,5)
>>> print arr
[0, 1, 2, 3, 4]

>>> arr[0] = 1
>>> print arr
[1, 1, 2, 3, 4]

 

 

I actually find this really ugly, since we just created a list of incrementing numbers and the array still needs to be set to zero. So to create a real array preset to zero, one way is to use the following.

 

 

>>> arr = []
>>> for i in range(0, 5):
...  arr.append(0)
...
>>> print arr
[0, 0, 0, 0, 0]
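
A shorter way to get the same result, sketched here with a made-up size variable, is to repeat a one-item list with the * operator. This is safe for plain numbers like 0.

```python
size = 5           # hypothetical length of the list
arr = [0] * size   # five zeros, no loop needed
arr[0] = 1         # index assignment now works because the slots exist

print(arr)  # [1, 0, 0, 0, 0]
```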

 

 

Moving on to the 2D array, you do exactly the same thing, except that you create an inner list and append it as an item of the outer list. Like this:

 

 

>>> arr = []
>>> for i in range(0, 5):
...  x = []
...  for j in range(0, 5):
...   x.append(0)
...  arr.append(x)
...
>>> print arr
[
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]
]

 

Now you can read and write the items using an array-like syntax, from arr[0][0] to arr[4][4].
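
The same 5 x 5 grid can also be built in one line with a list comprehension. The sketch below uses made-up names (rows, cols, grid, aliased) and also shows a trap worth knowing about: repeating the outer list with * copies references to one inner list, so every "row" is the same object.

```python
rows, cols = 5, 5

# Each pass of the comprehension builds a fresh inner list.
grid = [[0] * cols for _ in range(rows)]
grid[0][0] = 1
print(grid[1][0])  # 0, the other rows are untouched

# Pitfall: this repeats a reference to the SAME inner list five times.
aliased = [[0] * cols] * rows
aliased[0][0] = 1
print(aliased[4][0])  # 1, the change shows up in every row
```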

 

There is another way to support 2D arrays in Python: use a single list and calculate the offsets yourself. However, for the particular problem I was trying to solve it really didn't help, since Python doesn't appear to support proper for loops either!
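
For reference, a minimal sketch of the single-list approach, assuming row-major order (all of row 0 first, then row 1, and so on). The helper name offset and the sizes are made up for illustration.

```python
rows, cols = 5, 5
flat = [0] * (rows * cols)  # one list holding every cell


def offset(row, col):
    """Map a (row, col) pair to its position in the flat list (row-major)."""
    return row * cols + col


flat[offset(2, 3)] = 7      # write to row 2, column 3
print(offset(2, 3))         # 13
print(flat[offset(2, 3)])   # 7
```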

 

I guess that's what you get with Python!
