Character changed when get_bytes() is executed

ardianlama · March 26, 2020, 3:25pm

Hi,

I’m new to Anvil, but I find it very useful and pleasant to work with. I’m trying to build a simple web app with a text area and a button. Everything has been fast and easy, but I’ve run into a small problem. When the app calls get_bytes() on a text file, special characters like ‘ë’ or ‘ç’ turn into gibberish. The same app has no trouble writing the correct characters to the text file with set_bytes(). I would really appreciate it if someone could help me with this.

Thanks and regards,

campopianoa · March 26, 2020, 3:31pm

Hello and welcome,

When you have your text string and are ready to display it, could you try prefixing your string with a `u`? For example:

self.text_area_1.text = u'ěščřžýáí'  # replace with your text but keep the `u`

If this is the reason for the issue, here is some background explanation:

ardianlama · March 26, 2020, 4:01pm

Hello @alcampopiano,

Thanks for your reply. I’m using the following code in the form’s show function:

def text_area_1_show(self, **event_args):
    f = app_files.filename_txt
    self.text_area_1.text = f.get_bytes()

Not sure how the ‘u’ prefix can be used. Should I use some special syntax?

Thanks,

campopianoa · March 26, 2020, 4:02pm

Could you share a clone of your app? Click the icon in the IDE and follow the instructions.

If the app contains personal information, do not share it; otherwise, a clone will help me debug more quickly. It will be a simple solution for sure.

ardianlama · March 26, 2020, 4:14pm

Sure,

Here’s the link: https://anvil.works/build#clone:RMDBR6YVNDM5FIXU=PZU3P7IRRCV6RKVXUAG6WPFH

Thanks

campopianoa · March 26, 2020, 4:33pm

Could you try converting your bytes to a string?

Try one of the following to see which works:

my_bytes = app_files.filename_txt.get_bytes()
my_string = my_bytes.decode('utf-8')

Or:

my_bytes = app_files.filename_txt.get_bytes()
my_string = str(my_bytes, 'utf-8')

If you can manage to get the bytes into a string with the right encoding, I think you should be okay.

Let me know if either of these works. I can’t test with your actual file since you are pulling it from your Google environment, which I don’t have access to.
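
For reference, the first option could be folded into the show handler from earlier in the thread roughly as follows. This is only a minimal sketch: `FakeFile` is a hypothetical stand-in for `app_files.filename_txt` so the snippet runs outside Anvil, `read_text` is an illustrative helper name rather than part of Anvil's API, and the file is assumed to have been saved as UTF-8.

```python
class FakeFile:
    """Hypothetical stand-in for app_files.filename_txt (not an Anvil class)."""

    def __init__(self, data: bytes):
        self._data = data

    def get_bytes(self) -> bytes:
        # Mirrors the get_bytes() call used in the thread: returns raw bytes.
        return self._data


def read_text(f, encoding="utf-8"):
    """Return the file's contents as a str instead of raw bytes."""
    return f.get_bytes().decode(encoding)


# Assumption: the file was written as UTF-8.
f = FakeFile("ë and ç".encode("utf-8"))
text = read_text(f)
print(text)
```

Inside the real handler, the assignment would then presumably become `self.text_area_1.text = read_text(app_files.filename_txt)`.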

ardianlama · March 26, 2020, 4:50pm

Hi @alcampopiano,

It works perfectly with all the special characters. I used the first option.
Thanks a lot. I think I’ll stick with Anvil for the comprehensive workflow and its great forum.

Bests,

campopianoa · March 26, 2020, 4:52pm

Oh I’m so glad. To be honest I was just guessing a little bit and there may be better solutions.

For me, using Anvil was a no-brainer. It just made web life so much easier.

Good luck with your development.

meredydd · March 26, 2020, 6:20pm

This is the right answer, @alcampopiano! When you read the file, you get a bytes object, which represents a sequence of bytes rather than characters. To turn it into a string, you have to decode it with a particular encoding, which tells Python how to interpret those bytes as characters, and that’s what the code snippets you posted do. (For text, UTF-8 is almost always the right encoding.)
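
To make the bytes-versus-characters distinction concrete, here is a small standalone sketch in plain Python 3 (no Anvil involved). The sample string is made up for illustration, and Latin-1 is used only as an example of a wrong encoding; the thread does not say which interpretation actually produced the gibberish in the original app.

```python
original = "ë ç"

# Encoding turns characters into bytes; each accented letter takes two bytes in UTF-8.
raw = original.encode("utf-8")
print(raw)       # b'\xc3\xab \xc3\xa7'
print(len(original), len(raw))  # 3 characters, 5 bytes

# Decoding with the matching encoding recovers the original text.
decoded = raw.decode("utf-8")
print(decoded)   # ë ç

# Both snippets from the thread are equivalent ways to decode.
same = str(raw, "utf-8") == decoded

# Decoding with a mismatched encoding treats every byte as its own character,
# which is one common way gibberish like this appears.
garbled = raw.decode("latin-1")
print(garbled)   # Ã« Ã§
```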