Transformers library not able to run AutoModelForSeq2SeqLM

I'm trying to use the transformers library and load:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, GenerationConfig

My code:

import anvil.files
from anvil.files import data_files
import anvil.tables as tables
import anvil.tables.query as q
from anvil.tables import app_tables
import anvil.server

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, GenerationConfig

PATH = data_files['models--google--flan-t5-base/snapshots/c782cba52f8ea6a704240578055cf1c3fc2f2ca9']
model_name = 'google/flan-t5-base'
tokenizer = AutoTokenizer.from_pretrained(PATH, local_files_only=True)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
config = GenerationConfig(max_new_tokens=200)

@anvil.server.callable
def chat(msg):
  tokens = tokenizer(msg, return_tensors="pt")
  outputs = model.generate(**tokens, generation_config=config)
  return tokenizer.batch_decode(outputs, skip_special_tokens=True)

I get an error message. The cause of the problem is the AutoModelForSeq2SeqLM call:

AutoModelForSeq2SeqLM.from_pretrained(model_name)  # <— here we get this error:
anvil.server.ExecutionTerminatedError: Server code execution process was killed. It may have run out of memory: 38f6cbd3ca
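As an aside, from_pretrained accepts options that lower peak memory while loading. A minimal sketch, assuming PyTorch is installed and reusing the local snapshot path already used for the tokenizer (the helper name load_model is hypothetical):

```python
import torch
from transformers import AutoModelForSeq2SeqLM

def load_model(snapshot_path):
    # Hypothetical helper: load the model with memory-saving options.
    return AutoModelForSeq2SeqLM.from_pretrained(
        snapshot_path,
        local_files_only=True,       # reuse the local snapshot, like the tokenizer
        torch_dtype=torch.float16,   # store weights in half precision
        low_cpu_mem_usage=True,      # avoid a second full copy during init
    )
```

Whether this fits under the server's RAM limit still depends on the model size; half precision roughly halves the in-memory weight footprint.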

How can I solve that?
Cheers, Aaron

As you’re on a Business plan, we can provide an increased RAM allowance for your server code (this will be self-serve soon :wink: ). Are you using Python 3.10?

Hi,
yes, please increase the RAM allowance for us.
We use Python 3.10 (base package: Machine Learning). Would you suggest otherwise?
Another issue: as we are on the Business plan, our quotas should be:
1,000,000 data rows (we currently have just 150,000)
100 GB storage (we currently have just 10 GB), which matters because the flan-t5-xl model is 10 GB
Thanks a lot
Aaron

@meredydd
Hope you haven't forgotten me.
Cheers, Aaron