Python - Reducing memory used by a large dict
I need to create an in-memory object that has keys which are 9-digit integers, with a boolean value associated with each key. I've been using a dict, as in the simplified example below:
#!/usr/bin/python
from __future__ import print_function
import sys

mydict = {}
for n in range(56000):
    mydict[n] = True

print('count:', len(mydict), ' size:', sys.getsizeof(mydict))
I need to be able to look up and retrieve the boolean value associated with each key. The problem is the size of the dict. Using Python 2.7 on a 64-bit Linux system with the above example, the size of the dict is 3.1 megabytes according to sys.getsizeof() (about 56 bytes per entry to store 9 digits plus a boolean value).
I need to store the boolean state of approximately 55,000 entries in the dict. Each dict key is a 9-digit integer. I've tried using the integer and str(theinteger) as keys, with no change in the size of the dict.
Is there some other sort of data structure or methodology I should be using to conserve memory with such a large data set?
If you look up your boolean by an integer key, and the range of keys starts at 0 and is continuous, there is no reason not to use a list:
my_list = []
for n in range(56000):
    my_list.append(True)  # assigning my_list[n] on an empty list would raise IndexError
Or better:
my_list = [True for n in range(56000)]
If that is not enough, try the array module and use one byte per bool:
import array
my_array = array.array("b", (True for n in range(56000)))
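To get a rough feel for the difference, the three containers mentioned so far can be compared with sys.getsizeof(). This is a minimal sketch rather than a benchmark: the variable names are illustrative, sys.getsizeof() reports only the container itself (not the objects it refers to), and the exact numbers depend on the Python version and platform.

```python
import array
import sys

N = 56000  # number of entries, as in the question

as_dict = {n: True for n in range(N)}
as_list = [True] * N
as_array = array.array("b", [1] * N)  # signed char: one byte per flag

sizes = {
    "dict": sys.getsizeof(as_dict),
    "list": sys.getsizeof(as_list),
    "array": sys.getsizeof(as_array),
}

# Print the container sizes side by side.
for name, size in sizes.items():
    print("%-5s %8d bytes" % (name, size))
```

On CPython the array should come out smallest and the dict largest, since a list stores one pointer per entry while the array stores one byte per entry.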
And if that is still not enough, try the bitarray module on PyPI.
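The bitarray package is a third-party install. To illustrate the underlying idea of one bit per flag, here is a standard-library sketch built on bytearray. The BitFlags class and its method names are hypothetical and are not part of bitarray or of any library; it also assumes, like the list and array variants, that keys run from 0 to size - 1.

```python
class BitFlags(object):
    """Fixed-size collection of booleans, one bit each (hypothetical helper)."""

    def __init__(self, size):
        self.size = size
        # Round up to whole bytes; every flag starts out False.
        self._bits = bytearray((size + 7) // 8)

    def _check(self, index):
        if not 0 <= index < self.size:
            raise IndexError("flag index out of range")

    def __getitem__(self, index):
        self._check(index)
        # Byte index is index // 8, bit position inside the byte is index % 8.
        return bool(self._bits[index >> 3] & (1 << (index & 7)))

    def __setitem__(self, index, value):
        self._check(index)
        mask = 1 << (index & 7)
        if value:
            self._bits[index >> 3] |= mask
        else:
            self._bits[index >> 3] &= ~mask & 0xFF


flags = BitFlags(56000)
flags[2323] = True
print("%s %s" % (flags[2323], flags[2324]))  # True False
print("%d bytes of payload" % len(flags._bits))  # 7000 bytes of payload
```

A real bitarray object is likely to be faster and more convenient than this pure-Python version; the sketch only shows why the storage drops to roughly N / 8 bytes.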
Another idea is to use a set: say you have many more False than True, simply have a set:
my_true_numbers = {0, 2323, 23452}  # just the True ones
And check with:
value = number in my_true_numbers
If you have more True than False, do it the other way round.
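Unlike a list, a set does not need the keys to start at 0 or be contiguous, so it also works when the keys are scattered 9-digit numbers. A minimal sketch of both variants follows; the sample keys and the helper names is_true and is_true_inverted are made up for illustration.

```python
# Variant 1: mostly False, so store only the keys whose value is True.
my_true_numbers = {123456789, 200000001, 987654321}


def is_true(number):
    """Return the boolean for a key when only the True keys are stored."""
    return number in my_true_numbers


# Variant 2: mostly True, so store only the keys whose value is False.
# Assumption: every number passed in is a valid key, because an unknown
# key would also read as True here.
my_false_numbers = {555555555}


def is_true_inverted(number):
    """Return the boolean for a key when only the False keys are stored."""
    return number not in my_false_numbers


print(is_true(123456789))           # True
print(is_true(111111111))           # False
print(is_true_inverted(555555555))  # False
print(is_true_inverted(123456789))  # True
```

Flipping a value is then a matter of calling add() or discard() on the set. The memory saving depends entirely on how lopsided the True/False split is.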