Memory is not released in uplink server

What I’m trying to do:
I’m using an Uplink server to do some calculations with QuantLib (a C++ library with SWIG Python bindings). The calculation in the Uplink server uses a lot of memory. I notice that when the calculation completes and the result is returned to Anvil, the memory is not released back to the OS.

I use htop to monitor the memory usage of the uplink server process.

What I’ve tried and what’s not working:
I can’t think of anything I can do.

Uplink Server Code Sample:

import anvil.server
import QuantLib as ql

anvil.server.connect("server_YTY7C3R3AIR2TCB35DL3DJBV-XCYMXVY7YLI4MHKO")

@anvil.server.callable
def run_test(num):
    result = consume_memory(num)
    print('Run completed!')
    return result

def consume_memory(num):
    # Create a list to hold many OIS objects
    objects = []

    # Create `num` objects, each consuming a significant amount of memory
    for i in range(num):
        objects.append(create_asset(1000))

    return "Created a lot of objects to consume memory."

def create_asset(notional):

    # Step 1: Set the valuation date
    valuation_date = ql.Date(31, 8, 2024)
    ql.Settings.instance().evaluationDate = valuation_date

    # Step 2: Define the calendar, conventions, and day count
    calendar = ql.TARGET()
    business_convention = ql.ModifiedFollowing
    day_count = ql.Actual360()

    # Step 3: Define the overnight index (e.g., EONIA for EUR, FedFunds for USD)
    eonia_fixing_date = ql.Date(30, 8, 2024)
    eonia_rate = 0.0015  # Example rate

    # Build the discount curve, attach it to the Eonia index, and then
    # add the fixing to the index that is actually used below
    fixed_rate = 0.01  # Example fixed rate of 1%
    flat_forward_rate = ql.FlatForward(valuation_date, ql.QuoteHandle(ql.SimpleQuote(fixed_rate)), day_count)
    discount_curve_handle = ql.YieldTermStructureHandle(flat_forward_rate)
    overnight_index = ql.Eonia(discount_curve_handle)
    overnight_index.addFixing(eonia_fixing_date, eonia_rate)

    # Step 4: Define the fixed leg parameters
    fixed_rate = 0.01  # Example fixed rate of 1%
    fixed_frequency = ql.Annual
    fixed_day_count = ql.Thirty360(ql.Thirty360.BondBasis)

    # Step 5: Define the swap maturity
    start_date = valuation_date
    maturity_date = calendar.advance(start_date, ql.Period(2, ql.Years))  # 2-year OIS

    # Step 6: Define the notional amount
    # notional = 1000000  # Example notional of 1,000,000

    # Step 7: Create the fixed leg schedule
    fixed_schedule = ql.Schedule(
        start_date,
        maturity_date,
        ql.Period(fixed_frequency),
        calendar,
        business_convention,
        business_convention,
        ql.DateGeneration.Forward,
        False
    )

    # Step 8: Create the Overnight Indexed Swap (OIS)
    ois = ql.OvernightIndexedSwap(
        ql.OvernightIndexedSwap.Payer,  # Swap type (Payer/Receiver)
        notional,
        fixed_schedule,
        fixed_rate,
        fixed_day_count,
        overnight_index,
    )

    # Step 9: Set up a discount curve for pricing (using the fixed rate as the discount rate for simplicity)
    discount_curve = ql.FlatForward(valuation_date, ql.QuoteHandle(ql.SimpleQuote(fixed_rate)), day_count)
    discount_curve_handle = ql.YieldTermStructureHandle(discount_curve)

    # Step 10: Set the pricing engine and calculate NPV and fair rate
    ois_engine = ql.DiscountingSwapEngine(discount_curve_handle)
    ois.setPricingEngine(ois_engine)

    # Output the results
    # print("NPV of the OIS:", ois.NPV())
    # print("Fair fixed rate:", ois.fairRate())

    return ois

anvil.server.wait_forever()


Clone link:
Anvil | Login

Hi,

Could this be of any help? Stack Overflow: “Deleting variable does not erase its memory from RAM memory”

E.g.:

import numpy as np
import gc

a = np.array([1,2,3])
del a
gc.collect()

In your case it would look like this:

import gc

def consume_memory(num):
    # Create a list to hold the heavy objects
    objects = []

    # Create `num` objects, each consuming a significant amount of memory
    for i in range(num):
        objects.append(create_asset(1000))

    # Explicitly delete the list
    del objects

    # Manually trigger garbage collection
    gc.collect()

    return "Created a lot of objects to consume memory."

But I guess that you want to return the list called `objects` from the `consume_memory` method?
Perhaps it’s worth a try to make `objects` a global variable, and move the deletion of the variable and the garbage-collection call to after you are done working with the data. Perhaps not optimal, but it’s worth a try for starters :slight_smile:

Thank you very much! I don’t want the `consume_memory` function to return the objects. The objects are created inside the function, some calculation is done on them, and only the result is returned; after that the objects should be deleted. So the whole life cycle of the objects is within the function. My understanding was that when `consume_memory` returns, all objects created inside it would be freed implicitly.

I have tried the code above to delete the objects explicitly and call gc.collect(). Unfortunately it doesn’t make a difference.

Btw, I waited for about 10 minutes after the run finished. The memory usage of the process still doesn’t change (i.e. the memory has not been returned to the OS).

If you call the function twice, does the memory usage double, or does it seem to recycle the old memory?

Some libraries allocate and keep the memory for later use.
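If the memory is being held by the C allocator rather than by live Python objects, one thing worth trying on Linux is glibc’s `malloc_trim`, which asks the allocator to hand freed heap pages back to the OS. A minimal sketch, assuming a glibc-based Linux system (the `trim_memory` wrapper name is mine):

```python
import ctypes

def trim_memory():
    """Ask glibc to return freed heap pages to the OS (Linux/glibc only)."""
    try:
        libc = ctypes.CDLL("libc.so.6")
    except OSError:
        return False  # not a glibc system (e.g. Windows, macOS, musl)
    # malloc_trim(0) returns 1 if some memory was released, 0 otherwise
    return bool(libc.malloc_trim(0))
```

You would call `trim_memory()` after `gc.collect()`; whether it helps depends on whether the freed memory actually went through glibc’s `malloc` in the first place.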

Oh I see :slight_smile:

Just a thought: perhaps it might be worth spawning a separate thread for “the heavy work”. After the thread is done, it should release the memory (as I understand it).

This might help us isolate the problem :slight_smile:

I tried this. The memory usage doesn’t change if I run it a second time, and it only increases when I create more objects than before. It appears the memory usage acts as a high-water mark.
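For what it’s worth, you can watch this from inside the process instead of htop. A Linux-only sketch that reads the current resident set size from `/proc/self/status` (the `current_rss_kib` name is mine, not from any library):

```python
def current_rss_kib():
    """Return the process's current resident set size in KiB (Linux only)."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                # The line looks like: "VmRSS:     123456 kB"
                return int(line.split()[1])
    return None

# Example: RSS before and after allocating and dropping a large list
before = current_rss_kib()
data = [bytearray(1000) for _ in range(50_000)]
del data
after = current_rss_kib()
print(f"RSS before: {before} KiB, after del: {after} KiB")
```

If `after` stays close to the peak even after the `del`, that is the high-water-mark behaviour you are seeing in htop.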

It could be a workaround.

I don’t know the library you are using, but it’s likely that it has its own allocator: it allocates memory when needed and never releases it, to avoid wasting time freeing memory and reallocating it later.

Usually this is what you want, because if you had enough memory when you used it first, you will have enough memory to keep it allocated between usages.

(This could be the behavior of Python itself, I’ve never tested if Python actually releases the memory after creating and deleting a large list of strings.)

If you really want to release the memory, then you can either dig into the options of that library (or Python itself) and see if there is a way to control it.

Or you could spawn a new process. @tobias.carlbom suggests spawning a new thread; you can try that, but all threads share the same heap and garbage collector, so it may not help. Spawning a new process will work: when the process exits, the OS reclaims all of its memory. It can be slow on Windows, but on Linux it’s much faster and could be a good solution.
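To illustrate the process route: run the heavy work in a child process and collect the result over a queue. When the child exits, the OS reclaims all of its memory, regardless of what the allocator inside it kept cached. A sketch with a stand-in for the QuantLib work (the helper names are mine):

```python
import multiprocessing as mp

def heavy_work(num, queue):
    # Stand-in for consume_memory(): allocate a lot, return only a summary
    objects = [bytearray(1000) for _ in range(num)]
    queue.put(f"Created {len(objects)} objects.")

def run_in_subprocess(num):
    queue = mp.Queue()
    child = mp.Process(target=heavy_work, args=(num, queue))
    child.start()
    result = queue.get()  # read before join() to avoid a full-pipe deadlock
    child.join()          # child exit returns all its memory to the OS
    return result
```

Only the small result object crosses back into the long-lived Uplink process, so its memory footprint stays flat.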


Hi @fei,

There are some excellent answers in this thread, which I think can be summarised as “the library you’re using isn’t releasing memory when you want it to”. To be clear, I don’t think this has anything to do with the Uplink, and it would be true in a plain Python script too. The suggestion of spawning a new process is the right one here. You can either do this manually, or make use of the multiprocessing module.
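As a concrete sketch of the multiprocessing route (with a stand-in for the real QuantLib-heavy function; in the Uplink script the pool call would go inside the `@anvil.server.callable`):

```python
from multiprocessing import Pool

def consume_memory(num):
    # Stand-in for the QuantLib-heavy consume_memory() from the original post
    objects = [bytearray(1000) for _ in range(num)]
    return f"Created {len(objects)} objects to consume memory."

def run_test(num):
    # maxtasksperchild=1 retires the worker after each task, so the child
    # process exits and the OS reclaims everything it allocated
    with Pool(processes=1, maxtasksperchild=1) as pool:
        return pool.apply(consume_memory, (num,))
```

The trade-off is the cost of starting a fresh worker per call, which is usually cheap on Linux.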

I hope this helps!
