This is a scheme I’m working on. Comments are invited.
A simple-to-use, reusable, chainable set of text-validators.
Introduction
This minimal framework includes support for
- sanitizing badly-pasted input text
- validating text against zero or more criteria (pre-defined and custom)
- converting the text to other types of values (e.g., numbers)
The latter is often critical for performing additional sanity-checks.
While this was originally written for use with Anvil.works’ text-entry fields, it is independent of any specific data source.
Operating principles
This framework uses a “whiteboard” metaphor, borrowed from expert-system architecture. The data source writes its initial data (the entered text) to a whiteboard (a dictionary). Then it calls in your chosen expert (a function) to review the whiteboard contents.
If the input passes inspection, the expert leaves without comment. If it fails, then the expert leaves a description of the error on the whiteboard.
The caller can do anything it wants with that message: display it, log it, raise an exception, … The expert doesn’t need to know, which keeps the expert (function) simple and focused on the diagnostic task.
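For example, a caller might look like this. (The names `run_expert` and `not_blank` are illustrative only; they are not part of the framework itself, and raising an exception is just one of the options mentioned above.)

```python
def not_blank(wb):
    # A trivial expert, for demonstration: fail on all-whitespace input.
    if not wb['text'].strip():
        wb['err_msg'] = 'must not be blank'

def run_expert(expert, text):
    """One possible caller: build a whiteboard, consult the expert,
    then decide what to do with any error message (here: raise)."""
    wb = {'text': text}
    expert(wb)
    if 'err_msg' in wb:
        raise ValueError(wb['err_msg'])  # or display it, or log it, ...
    return wb

wb = run_expert(not_blank, 'hello')  # passes silently; wb has no 'err_msg'
```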
An expert doesn’t have to know and do everything by itself. (In fact, such a beast can be very tricky to write and maintain.) Instead, it can call upon other proven experts, in exactly the same way.
In fact, we include two trivial mechanisms for “building” an expert, using other experts as building-blocks. This allows every expert to be as simple and well-focused as possible.
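One such building-block mechanism might look like the sketch below: a combinator that runs experts in order against the same whiteboard, stopping at the first failure. (The name `chain` and the two sample experts are illustrative; my actual combinators may differ.)

```python
def chain(*experts):
    """Combine several experts into one; stop at the first error."""
    def combined(wb):
        for expert in experts:
            expert(wb)
            if 'err_msg' in wb:
                return  # short-circuit: later experts never run
    return combined

def strip_spaces(wb):
    # A sanitizer: tidy the text in place.
    wb['text'] = wb['text'].strip()

def not_empty(wb):
    # A validator: complain if nothing is left.
    if not wb['text']:
        wb['err_msg'] = 'must not be empty'

check = chain(strip_spaces, not_empty)
wb = {'text': '   '}
check(wb)
print(wb['err_msg'])  # must not be empty
```

Because every expert reads and writes the same whiteboard, the combined function is itself an expert, and can be chained further.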
Informally, we can divide experts into the three categories noted above: sanitizers, validators, and converters. We’re sure you can find more uses.
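A converter, the third category, might be sketched like this (the name `to_int` is illustrative):

```python
def to_int(wb):
    """A converter: parse the text into a number, or report why not."""
    try:
        wb['value'] = int(wb['text'])
    except ValueError:
        wb['err_msg'] = 'must be a whole number'

wb = {'text': '42'}
to_int(wb)
print(wb['value'])  # 42
```

Once 'value' is on the whiteboard, later experts can run numeric sanity checks (range limits, sign, and so on) without re-parsing the text.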
Specifics
A whiteboard is simply a Python dict. By convention:
- “text” contains the input string (possibly sanitized).
- “err_msg”, if present, contains the error message.
- “value”, if present, contains the converted value (e.g., a number).
You can easily extend this system with tags of your own, as needed.
Here’s a sample whiteboard, as provided by the caller:
wb = {'text': ' your text here '}
print(wb)
{'text': ' your text here '}
We can define a validator quite simply:
def length_check(wb):
    if len(wb['text']) > 10:
        wb['err_msg'] = 'must be 10 characters or less'
Notice that we’re not even returning a value. The mere presence of the error message tells the caller that the text has failed the test.
The caller will invoke the validator as follows:
length_check(wb)
This changes the whiteboard contents:
print(wb)
{'err_msg': 'must be 10 characters or less', 'text': ' your text here '}
Thereafter, we can ask the whiteboard how things went:
print(validation_failed(wb))
True
print(validation_succeeded(wb))
False
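Given the conventions above, these helpers could be as simple as the following sketch (my actual implementations may differ):

```python
def validation_failed(wb):
    # An error message on the whiteboard means the text failed.
    return 'err_msg' in wb

def validation_succeeded(wb):
    return 'err_msg' not in wb

print(validation_failed({'text': 'hi', 'err_msg': 'too short'}))  # True
print(validation_succeeded({'text': 'hi'}))                       # True
```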
This is sometimes useful when an expert calls upon other experts. Otherwise, we leave such testing to the caller.
I’m currently building this scheme, including some (useful?) sample “experts”.
I’m also building an Anvil-specific caller that will link any TextBox to such a validator, for plug-and-play operation.
I’ve tried simpler designs, but they all seemed to make the job harder. With this approach, any validation job should be decomposable into a small set of trivially reused parts, one part per criterion or per transformation.
I’ve also skimmed a number of third-party validation libraries on the Python Package Index. They’re aimed at checking data after it has all been collected into a composite structure.
That’s no help here. For data-entry purposes, the user should have immediate feedback, preferably before they even leave the entry field. The sooner they understand what’s needed, the better. My scheme supports that.
Does it look to you like I’m on the right track? Or at least a useful one?