Read .docx/.pdf file in Anvil

Hi everyone,
I would like to build an interface that lets the user read the contents of a PDF or Word file in Anvil after the file has been uploaded.

But I don’t know how to code it, since I am still a beginner in Python. Please give me some ideas and sample code for reference. Thank you very much.

You would need to pull the PDF or Doc file into a server module, where you could then parse it with a library. Here is a link to the list of packages available in server modules: https://anvil.works/docs/server/packages

If a package you need is missing you can request it to be installed. Note: not all packages are available on the free tier.

For PDFs, PyPDF2 is available. You can find the PyPI page here. For Doc files I don’t know, but I’m sure there are numerous packages out there if one is not already installed.
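
For .docx files specifically, here is a rough sketch that needs no extra package. It relies on the fact that a .docx file is a zip archive whose main text lives in `word/document.xml`. The function name `extract_docx_text` is made up for this example, and it only reads body paragraphs (it ignores headers, footnotes, and the older binary .doc format), so treat it as a starting point rather than a complete parser.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(data: bytes) -> str:
    """Return the body text of a .docx file, one line per paragraph.

    Hypothetical helper for illustration; `data` is the raw file content.
    """
    # A .docx file is a zip archive; the main text is in word/document.xml.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Each paragraph is split into runs; the visible text sits in <w:t>.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

In a server module you would pass it the bytes of the uploaded Media object, for example `extract_docx_text(file.get_bytes())`, and return the resulting string to the form.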

Thanks robert, let me try it first. Thank you very much

To point you to the relevant section in the docs, this should get you started on how to process the file on the server.

https://anvil.works/docs/media#files-in-server-modules
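
As a minimal sketch of what that could look like for a PDF: the server function below takes the uploaded Media object, reads its bytes, and hands them to PyPDF2. I am assuming PyPDF2 2.0 or later (the `PdfReader` / `extract_text` names; older releases used different names), and `extract_pdf_text` is just a name I picked. The `try/except` around the Anvil import is only there so the code can also be run outside Anvil for testing.

```python
import io

try:
    import anvil.server
    server_callable = anvil.server.callable
except ImportError:
    # Fallback so the function can be tried outside Anvil (assumption for
    # local testing only; inside Anvil the real decorator is used).
    def server_callable(func):
        return func


@server_callable
def extract_pdf_text(media):
    """Return the text of an uploaded PDF, pages separated by blank lines."""
    if media.content_type != "application/pdf":
        raise ValueError("Expected a PDF, got %s" % media.content_type)

    # Imported here so the module still loads if PyPDF2 is not installed.
    # Assumes PyPDF2 >= 2.0.
    from PyPDF2 import PdfReader

    reader = PdfReader(io.BytesIO(media.get_bytes()))
    # extract_text() may return nothing for scanned or image-only pages.
    pages = [(page.extract_text() or "") for page in reader.pages]
    return "\n\n".join(pages)
```

On the form side you would call it with something like `anvil.server.call('extract_pdf_text', self.file_loader_1.file)` and put the result in a text area. `file_loader_1` is a placeholder for whatever your FileLoader component is called.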

https://anvil.works/build#clone:J3Q42AMB6X7GZ7YY=BXH74QZT2ZFC4IYR7BG3IUTQ

Here is a simple example of embedding a PDF from your data table in a front-end form.

I’m not sure if the endpoint or custom HTML will be available to you if you are on a free account, though.

HTTP endpoints and Custom HTML components are just fine on the free plan, @joinlook :slight_smile:

A post was split to a new topic: Error when cloning app

This looks fantastic! Super helpful, thanks!

Just wondering if it’s possible to serve the file by any means other than HTTP endpoints, or perhaps with some form of authentication?

Correct me if I’m wrong, but it looks like with this method all files are publicly available to anyone who has the URL?

You are not likely to get any help when you reply to an old question with another question.

You will have better luck if you ask a new question with a proper subject.