OSError: [Errno 30] Read-only file system: '.chroma'

I’m trying to run a program that uses langchain and chroma in a server module. I guess chroma needs to write a temporary file, .chroma, somewhere.

Because I need to install those libraries, I’m on Python 3.10 Beta.

Anyone know how to tell it to save it in the /tmp/ directory?

Thanks!

I have no idea, but I asked Claude 2 and got this (I’m on a call right now). You may be lucky; here it is:


Here are a few suggestions for directing Chroma to use the /tmp directory without setting environment variables:

  • Set the CHROMA_CACHE_DIR config variable before importing Chroma:
import chroma
chroma.config.cache_dir = '/tmp/.chroma'
  • Monkey-patch Chroma’s get_default_cache_dir() to return /tmp/.chroma:
import chroma
def get_tmp_cache_dir():
    return '/tmp/.chroma'
chroma.get_default_cache_dir = get_tmp_cache_dir
  • Subclass Chroma’s Configuration and override the cache_dir property:
from chroma import Configuration

class MyConfig(Configuration):
    @property
    def cache_dir(self):
        return '/tmp/.chroma'

chroma.config = MyConfig()
  • Set the location of the Chroma config file to use a custom config:
chroma.config_file = '/path/to/myconfig.ini' 

Where myconfig.ini contains:

[chroma]
cache_dir = /tmp/.chroma

The key is to configure Chroma’s cache directory before it gets used for the first time. Let me know if any of those suggestions help or if you have any other questions!

Thanks so much for the help. Unfortunately, none of those things worked. I think you have the right idea about setting the directory. The LangChain AI chatbot gave erroneous suggestions, too.

I’m using the Google Drive integration to get around not being able to write to the Anvil file system.

Chroma has to write a file (.chroma) to a directory, which I can set using a db_path parameter, but I can’t seem to pass it a Google Drive folder. Help, please! I have the Google Drive folder loaded as:

gdriveFolder = app_files.tempuserfiles

I’m speculating, but I doubt you’ll be able to pass it a google drive folder. I’m guessing you’ll have to use the file system. It may be easier to help you if you shared more details about the errors you’re seeing, what you’ve tried, etc., with the gold standard being a clone link to a simplified demo app demonstrating the issue. It may also be helpful to link to relevant chroma docs, etc.

I don’t mind using the filesystem, if I can.

Here’s the current app: Anvil | Login

Just click the Upload files buttons for Steps 1 and 2, using the files below, then under Step 3 click See Results!

Here are 2 dummy files to use:

Step1: …ContentStandards_dummy.csv
Step2: Unit1_Only_DUMMY.csv

The only error I get is
OSError: [Errno 30] Read-only file system: '.chroma'

  • at /usr/local/lib/python3.10/os.py:225
  • called from /usr/local/lib/python3.10/os.py:215
  • called from /home/anvil/.env/lib/python3.10/site-packages/chromadb/db/index/hnswlib.py:209
  • called from /home/anvil/.env/lib/python3.10/site-packages/chromadb/db/index/hnswlib.py:123
  • called from /home/anvil/.env/lib/python3.10/site-packages/chromadb/db/index/hnswlib.py:142
  • called from /home/anvil/.env/lib/python3.10/site-packages/chromadb/db/clickhouse.py:639
  • called from /home/anvil/.env/lib/python3.10/site-packages/chromadb/api/local.py:260
  • called from /home/anvil/.env/lib/python3.10/site-packages/chromadb/api/local.py:318
  • called from /home/anvil/.env/lib/python3.10/site-packages/chromadb/api/models/Collection.py:299
  • called from /home/anvil/.env/lib/python3.10/site-packages/langchain/vectorstores/chroma.py:150
  • called from /home/anvil/.env/lib/python3.10/site-packages/langchain/vectorstores/chroma.py:430
  • called from /home/anvil/.env/lib/python3.10/site-packages/langchain/vectorstores/chroma.py:462
  • called from /home/anvil/.env/lib/python3.10/site-packages/langchain/indexes/vectorstore.py:78
  • called from /home/anvil/.env/lib/python3.10/site-packages/langchain/indexes/vectorstore.py:73
  • called from tagger, line 87
  • called from start, line 137

image

I think this should be

chroma = Chroma(persist_directory="/tmp/.chroma")

instead of db_path=, but I am unsure whether this still needs to be inside a with block.
…or really whether this will even work the way you want: if more than one user is using your site at the same time, they would share the same temporary file, which could cause aberrant behaviour.
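Since the original error came from a read-only location, it may help to confirm the target directory exists and is writable before handing it to Chroma. The following is a minimal sketch using only the standard library. `ensure_writable_dir` is a hypothetical helper, and the commented-out `Chroma(...)` call assumes the LangChain wrapper accepts `persist_directory` as suggested above. It uses `tempfile.gettempdir()`, which is normally `/tmp` on Linux.

```python
import os
import tempfile


def ensure_writable_dir(path):
    """Create the directory if needed and confirm a file can be written in it."""
    os.makedirs(path, exist_ok=True)
    # Creating a throwaway file is a direct probe: it raises OSError
    # (for example Errno 30) if the location is read-only.
    with tempfile.NamedTemporaryFile(dir=path):
        pass
    return path


persist_dir = ensure_writable_dir(os.path.join(tempfile.gettempdir(), ".chroma"))

# Assumption: with langchain installed, the directory would then be used like this.
# from langchain.vectorstores import Chroma
# chroma = Chroma(persist_directory=persist_dir)
```

If the probe raises, the problem is the location itself rather than anything Chroma-specific.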


One way to solve that last problem would be to use

with anvil.media.TempFile() as temp_chroma_file:
  chroma = Chroma(persist_directory=str(temp_chroma_file))

but every single line of the rest of your code that uses chromadb will have to exist within this block.

Or you could refactor all the rest of your code into a function that gets called from inside this temporary-file with block, passing the randomized name of the temporary file to the function.
This way the “persistent” database file is randomized and ephemeral instead.
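The refactor described above could look roughly like this. Note that this sketch swaps in the standard library's `tempfile.TemporaryDirectory` rather than `anvil.media.TempFile`, on the assumption that chromadb wants a directory (the traceback shows it calling into `os.py` to create `.chroma`). `run_pipeline` and `handle_request` are hypothetical names, and the Chroma call is left as a comment so the example runs without langchain installed.

```python
import os
import tempfile


def run_pipeline(persist_dir):
    """Hypothetical stand-in for all the code that builds and queries the index."""
    # Assumption: the real version would start with something like
    #   chroma = Chroma(persist_directory=persist_dir)
    # Here a marker file is written just to show the directory is usable.
    marker = os.path.join(persist_dir, "marker.txt")
    with open(marker, "w") as f:
        f.write("ok")
    return os.path.exists(marker)


def handle_request():
    # Each call gets its own randomly named directory,
    # which is deleted when the with block exits.
    with tempfile.TemporaryDirectory(prefix="chroma_") as persist_dir:
        result = run_pipeline(persist_dir)
    return result, persist_dir
```

Everything that touches the database lives inside `run_pipeline`, so nothing tries to use the directory after it has been cleaned up.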

… or you could create a tempfile name directly from the session ID of the user, so two different users could not collide. However, there is no guarantee that the same file will still exist between server calls, even in the same session.
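A rough sketch of the session-based naming idea, assuming the session ID is available as a string (in Anvil it would presumably come from something like `anvil.server.get_session_id()`; that call is not exercised here). `session_persist_dir` is a hypothetical helper. Hashing the ID keeps the directory name filesystem-safe, and the directory is recreated on demand because it may not survive between server calls.

```python
import hashlib
import os
import tempfile


def session_persist_dir(session_id, base=None):
    """Return a per-session directory path, creating it if it is missing."""
    base = base or tempfile.gettempdir()
    # Hash the session ID so the name contains only safe characters
    # and does not expose the raw ID on disk.
    digest = hashlib.sha256(session_id.encode("utf-8")).hexdigest()[:16]
    path = os.path.join(base, "chroma_" + digest)
    # The directory may have vanished since the last call, so always recreate it.
    os.makedirs(path, exist_ok=True)
    return path
```

The same session ID always maps to the same path, while different sessions get different paths; any data that must survive would still need to be rebuilt if the directory turns out to be empty.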


For now this is just a proof of concept, so multiple users won’t be a problem. Thanks for your help. I’ll give this a shot and mark it as a solution if this works. I appreciate the effort and thoroughness!