Mem usage and Increment equal for many lines #234

@PiotrZiolo

I was searching for a memory leak in my code, and for debugging purposes I unrolled a for loop that fills a dict with objects into separate lines. The objects are created on those lines. Here's the report from memory_profiler:

Line #    Mem usage    Increment   Line Contents
================================================
    44 1125.984 MiB 1125.984 MiB       @profile
    45                                 def __init__(self, priors=None, base_classes=None, seed=9):
    46                                     """
    47                                     :param pd.DataFrame priors: DataFrame with three columns: date, hour_of_week,
    48                                         prior. The last column defines priors (in the form of dictionaries)
    49                                         for given dates and hours of week.
    50                                     :param pd.DataFrame base_classes: DataFrame with three columns: date, hour_of_week,
    51                                         base_class. The last column defines priors (in the form of dictionaries)
    52                                         for given dates and hours of week.
    53                                     :param int seed: Seed for the random number generator.
    54                                     """
    55
    56 1125.984 MiB    0.000 MiB           self.models ={}  # type: Dict[str, CompetitiveClickProbabilityModule]
    57
    58 1125.984 MiB    0.000 MiB           self.seed = seed
    59 1125.984 MiB    0.000 MiB           self.rng = np.random.RandomState(seed)
    60 1125.984 MiB    0.000 MiB           self.priors = priors
    61 1125.984 MiB    0.000 MiB           self.base_classes = base_classes
    62
    63 1125.984 MiB    0.000 MiB           seed_min = 100000
    64 1125.984 MiB    0.000 MiB           seed_max = 999999
    65 1125.984 MiB    0.000 MiB           seeds = self.rng.randint(low=seed_min, high=seed_max, size=len(self.priors))
    66
    67 1126.316 MiB    0.332 MiB           base_df = priors.copy()
    68 1126.387 MiB    0.070 MiB           base_df.loc[:, "base_class"] = base_classes["base_class"]
    69 1127.254 MiB    0.867 MiB           base_df.loc[:, "seed"] = seeds
    70
    71 1127.254 MiB    0.000 MiB           self.models ={}  # type: Dict[str, SimulatorModule]
    72
    73 1127.383 MiB 1127.383 MiB           self.models["{}.{}".format(base_df["date"][0], base_df["hour_of_week"][0])] = base_df["base_class"][0](base_df["prior"][0], base_df["seed"][0])
    74 1127.508 MiB 1127.508 MiB           self.models["{}.{}".format(base_df["date"][1], base_df["hour_of_week"][1])] = base_df["base_class"][1](base_df["prior"][1], base_df["seed"][1])
    75 1127.898 MiB 1127.898 MiB           self.models["{}.{}".format(base_df["date"][2], base_df["hour_of_week"][2])] = base_df["base_class"][2](base_df["prior"][2], base_df["seed"][2])
    76 1128.383 MiB 1128.383 MiB           self.models["{}.{}".format(base_df["date"][3], base_df["hour_of_week"][3])] = base_df["base_class"][3](base_df["prior"][3], base_df["seed"][3])
    77 1128.758 MiB 1128.758 MiB           self.models["{}.{}".format(base_df["date"][4], base_df["hour_of_week"][4])] = base_df["base_class"][4](base_df["prior"][4], base_df["seed"][4])
    78 1129.176 MiB 1129.176 MiB           self.models["{}.{}".format(base_df["date"][5], base_df["hour_of_week"][5])] = base_df["base_class"][5](base_df["prior"][5], base_df["seed"][5])
    79 1129.633 MiB 1129.633 MiB           self.models["{}.{}".format(base_df["date"][6], base_df["hour_of_week"][6])] = base_df["base_class"][6](base_df["prior"][6], base_df["seed"][6])
    80 1130.070 MiB 1130.070 MiB           self.models["{}.{}".format(base_df["date"][7], base_df["hour_of_week"][7])] = base_df["base_class"][7](base_df["prior"][7], base_df["seed"][7])
    81 1130.508 MiB 1130.508 MiB           self.models["{}.{}".format(base_df["date"][8], base_df["hour_of_week"][8])] = base_df["base_class"][8](base_df["prior"][8], base_df["seed"][8])
    82 1130.883 MiB 1130.883 MiB           self.models["{}.{}".format(base_df["date"][9], base_df["hour_of_week"][9])] = base_df["base_class"][9](base_df["prior"][9], base_df["seed"][9])
    83 1131.320 MiB 1131.320 MiB           self.models["{}.{}".format(base_df["date"][10], base_df["hour_of_week"][10])] = base_df["base_class"][10](base_df["prior"][10], base_df["seed"][10])
    84 1131.758 MiB 1131.758 MiB           self.models["{}.{}".format(base_df["date"][11], base_df["hour_of_week"][11])] = base_df["base_class"][11](base_df["prior"][11], base_df["seed"][11])
    85 1132.156 MiB 1132.156 MiB           self.models["{}.{}".format(base_df["date"][12], base_df["hour_of_week"][12])] = base_df["base_class"][12](base_df["prior"][12], base_df["seed"][12])
    86 1132.633 MiB 1132.633 MiB           self.models["{}.{}".format(base_df["date"][13], base_df["hour_of_week"][13])] = base_df["base_class"][13](base_df["prior"][13], base_df["seed"][13])
    87 1133.008 MiB 1133.008 MiB           self.models["{}.{}".format(base_df["date"][14], base_df["hour_of_week"][14])] = base_df["base_class"][14](base_df["prior"][14], base_df["seed"][14])
    88 1133.434 MiB 1133.434 MiB           self.models["{}.{}".format(base_df["date"][15], base_df["hour_of_week"][15])] = base_df["base_class"][15](base_df["prior"][15], base_df["seed"][15])
    89 1133.883 MiB 1133.883 MiB           self.models["{}.{}".format(base_df["date"][16], base_df["hour_of_week"][16])] = base_df["base_class"][16](base_df["prior"][16], base_df["seed"][16])
    90 1134.320 MiB 1134.320 MiB           self.models["{}.{}".format(base_df["date"][17], base_df["hour_of_week"][17])] = base_df["base_class"][17](base_df["prior"][17], base_df["seed"][17])
    91 1134.758 MiB 1134.758 MiB           self.models["{}.{}".format(base_df["date"][18], base_df["hour_of_week"][18])] = base_df["base_class"][18](base_df["prior"][18], base_df["seed"][18])
    92 1135.141 MiB 1135.141 MiB           self.models["{}.{}".format(base_df["date"][19], base_df["hour_of_week"][19])] = base_df["base_class"][19](base_df["prior"][19], base_df["seed"][19])
    93 1135.570 MiB 1135.570 MiB           self.models["{}.{}".format(base_df["date"][20], base_df["hour_of_week"][20])] = base_df["base_class"][20](base_df["prior"][20], base_df["seed"][20])
    94 1136.008 MiB 1136.008 MiB           self.models["{}.{}".format(base_df["date"][21], base_df["hour_of_week"][21])] = base_df["base_class"][21](base_df["prior"][21], base_df["seed"][21])
    95 1136.418 MiB 1136.418 MiB           self.models["{}.{}".format(base_df["date"][22], base_df["hour_of_week"][22])] = base_df["base_class"][22](base_df["prior"][22], base_df["seed"][22])
    96 1136.883 MiB 1136.883 MiB           self.models["{}.{}".format(base_df["date"][23], base_df["hour_of_week"][23])] = base_df["base_class"][23](base_df["prior"][23], base_df["seed"][23])

As you can see, starting from line 73 the Increment column seems to be broken: it is always equal to Mem usage. Judging by the Mem usage values, each increment should be below 1 MiB. Is this a bug, or am I misunderstanding how it is supposed to work?
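
For reference, here is a rough sketch of what I would expect the Increment column to show. It assumes that Increment is meant to be the difference between a line's Mem usage and the Mem usage of the previously executed line, which is my reading and may not match how memory_profiler defines it. The helper name `expected_increments` is made up for this illustration; the numbers are copied from the report (line 71, then lines 73 to 78).

```python
# Mem usage readings in MiB, copied from the report above:
# line 71 first, then lines 73 through 78.
mem_usage = [1127.254, 1127.383, 1127.508, 1127.898, 1128.383, 1128.758, 1129.176]


def expected_increments(usage):
    """Return the difference between each reading and the one before it.

    Assumption: Increment = Mem usage of this line minus Mem usage of the
    previously executed line. This helper is hypothetical, not part of
    memory_profiler.
    """
    return [round(curr - prev, 3) for prev, curr in zip(usage, usage[1:])]


increments = expected_increments(mem_usage)

# Pair each computed increment with the report line it belongs to.
for line_no, inc in zip(range(73, 79), increments):
    print(f"line {line_no}: {inc:.3f} MiB")
```

Under that assumption every one of these lines adds well under 1 MiB, which is why the reported values of over 1127 MiB look wrong to me.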
