Google Magika AI

Hi all,

Last week Google open-sourced Magika, their AI-powered file-type identification system which helps detect binary and textual file types.

It’s pretty handy and really easy to work with, as it ships as a Python package we can slot straight into Anvil apps.

Add the package: “magika”

In your server code:

from magika import Magika

#@software{magika,
#author = {Fratantonio, Yanick and Bursztein, Elie and Invernizzi, Luca and Zhang, Marina and Metitieri, Giancarlo and Kurt, Thomas and Galilee, Francois and Petit-Bianco, Alexandre and Farah, Loua and Albertini, Ange},
#title = {{Magika content-type scanner}},
#url = {https://github.com/google/magika}
#}

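# Load the model once at import time; building Magika() on every call is slow.
_magika = Magika()

def magika_check(file_bytes):
  # Return Magika's content-type label for the given bytes.
  result = _magika.identify_bytes(file_bytes)
  # ct_label is the attribute in the release used here; later releases may name it differently.
  return result.output.ct_label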

All you need to do is pass your file's bytes (file.get_bytes()) to it!
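One thing you might do with the returned label is check an upload against an allowlist. Here's a minimal sketch of that idea, not anything from the Magika docs: `is_allowed_upload` and `ALLOWED_LABELS` are hypothetical names I made up, and the labels in the set are assumed examples, so check them against what Magika actually returns for your files. The identifier function is passed in as an argument so the helper can be tried out without loading the model.

```python
# Hypothetical allowlist; these label strings are assumptions, adjust to your app.
ALLOWED_LABELS = {"pdf", "png", "jpeg"}

def is_allowed_upload(file_bytes, identify, allowed=ALLOWED_LABELS):
    """Return True if identify(file_bytes) gives a label in the allowlist.

    identify is any function that maps bytes to a label string,
    for example the magika_check function from the post.
    """
    # Treat empty or missing uploads as not allowed.
    if not file_bytes:
        return False
    return identify(file_bytes) in allowed
```

In server code you would call it along the lines of `is_allowed_upload(file.get_bytes(), magika_check)`.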
