Token Classification: Named Entity Recognition
This cookbook walks through building, deploying, and using a Named Entity Recognition (NER) model with Encoderfile. We'll use BERT fine-tuned for NER to identify people, organizations, and locations in text.
What You'll Learn
- Export a token classification model to ONNX
- Build a self-contained encoderfile binary
- Deploy as a REST API server
- Make predictions via HTTP
- Use CLI for batch processing
Prerequisites
- `encoderfile` CLI tool installed (see the Installation Guide)
- Python with `optimum[exporters]` for ONNX export
- `curl` for testing the API
Step 1: Export the Model
We'll use dslim/bert-base-NER, a BERT model fine-tuned for named entity recognition.
About the Model
This model recognizes 4 entity types:
- PER - Person names
- ORG - Organizations
- LOC - Locations
- MISC - Miscellaneous entities
Export to ONNX
# Install optimum if you haven't already
pip install optimum[exporters]
# Export the model
optimum-cli export onnx \
--model dslim/bert-base-NER \
--task token-classification \
./ner-model
What files are created?
The export writes everything the model needs into `./ner-model/`: the ONNX graph (`model.onnx`) plus the tokenizer and model configuration files (typically `config.json`, `tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`, and `vocab.txt`).
Step 2: Create Configuration
Create a YAML configuration file for building the encoderfile.
Configuration Options
- `name` - Model identifier used in API responses
- `path` - Directory containing ONNX model files
- `model_type` - Must be `token_classification` for NER
- `output_path` - Where to save the binary (optional)
- `transform` - Optional Lua script for post-processing
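Putting these options together, a minimal ner-config.yml might look like the sketch below. The field names come from the list above; the exact values (especially output_path) are assumptions, so adjust them to match your own layout.
# ner-config.yml (minimal sketch)
name: ner-tagger
path: ./ner-model
model_type: token_classification
output_path: ./build/ner-tagger.encoderfile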
Step 3: Build the Binary
Build your self-contained encoderfile binary:
# Create output directory
mkdir -p build
# Build the encoderfile
encoderfile build -f ner-config.yml
Build Output
A successful build writes the binary to the `output_path` from your configuration, in this case `./build/ner-tagger.encoderfile`.
The resulting binary is completely self-contained - it includes:
- ONNX model weights
- Tokenizer
- Full inference runtime
- REST and gRPC servers
Step 4: Start the Server
Launch the encoderfile server:
# Make executable (if needed)
chmod +x ./build/ner-tagger.encoderfile
# Start server
./build/ner-tagger.encoderfile serve
Server Startup
The server is now running with both HTTP and gRPC endpoints.
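To confirm the HTTP endpoint is up, hit the health check (the examples in this guide use HTTP port 8080; see also Troubleshooting below):
# Verify the server is healthy
curl http://localhost:8080/health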
Step 5: Make Predictions
Now let's test the NER model with different types of text.
Example 1: Basic Entity Recognition
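A request looks roughly like the sketch below. Only the /health route is shown elsewhere in this guide, so the inference path (/infer) and the request body shape ({"inputs": [...]}) are assumptions; check your encoderfile's API documentation for the exact contract.
# Hypothetical request: the endpoint path and payload shape are assumptions
curl -X POST http://localhost:8080/infer \
  -H "Content-Type: application/json" \
  -d "{\"inputs\": [\"Mozilla's headquarters are in San Francisco, CA.\"]}"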
Entities Found:
- Mozilla → `B-ORG`, `I-ORG` (Organization)
- San Francisco → `B-LOC` (Location)
- CA → `B-LOC` (Location)

The `B-` prefix indicates the beginning of an entity, `I-` indicates inside/continuation, and `O` means outside any entity.
Example 2: Multiple Sentences
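Using the same assumed endpoint as Example 1, you can send several sentences in one request. The input sentences below are reconstructions chosen to match the entities listed afterwards:
# Same hypothetical endpoint and payload shape as Example 1
curl -X POST http://localhost:8080/infer \
  -H "Content-Type: application/json" \
  -d '{"inputs": ["Yvon founded Patagonia.", "The Eiffel Tower is in Paris, France."]}'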
Sentence 1:
- Yvon → Person (PER)
- Patagonia → Organization (ORG)

Sentence 2:
- Eiffel Tower → Miscellaneous (MISC)
- Paris → Location (LOC)
- France → Location (LOC)
Step 6: CLI Inference
For batch processing or one-off predictions, use the CLI directly:
Single Input
./build/ner-tagger.encoderfile infer \
"Tim Cook presented the new iPhone at Apple Park in California."
Batch Processing
./build/ner-tagger.encoderfile infer \
"Amazon was founded by Jeff Bezos in Seattle." \
"Mozilla's headquarters are in San Francisco, California." \
"Marie Curie won the Nobel Prize in Physics." \
-o results.json
This saves all results to results.json for further processing.
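Assuming results.json mirrors the response format shown under Understanding the Output below, you can post-process it with standard JSON tooling. For example, this jq sketch pairs each token with its predicted tag:
# Pair tokens with predicted labels (assumes the results/tokens/predicted_labels schema shown below)
jq '.results[] | [.tokens, .predicted_labels] | transpose' results.json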
Advanced Usage
Custom Ports
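The serve command accepts the --http-hostname and --http-port flags used in the Production Deployment example below; run serve --help for the full list (including any gRPC options). For example, to serve HTTP on a non-default port:
# Serve HTTP on port 9000 instead of the default
./build/ner-tagger.encoderfile serve --http-port 9000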
HTTP Only (Disable gRPC)
Production Deployment
# Copy to system location
sudo cp ./build/ner-tagger.encoderfile /usr/local/bin/
# Run as a service (example with systemd)
/usr/local/bin/ner-tagger.encoderfile serve \
--http-hostname 0.0.0.0 \
--http-port 8080
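If you want systemd to supervise the process, a minimal unit file might look like the sketch below. The unit name and paths are assumptions; only the serve command and its flags come from this guide.
# /etc/systemd/system/ner-tagger.service (sketch)
[Unit]
Description=NER tagger encoderfile server
After=network.target

[Service]
ExecStart=/usr/local/bin/ner-tagger.encoderfile serve --http-hostname 0.0.0.0 --http-port 8080
Restart=on-failure

[Install]
WantedBy=multi-user.target

Enable and start it with: sudo systemctl enable --now ner-tagger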
Understanding the Output
Token Classification Labels
The model uses the IOB (Inside-Outside-Beginning) tagging scheme:
| Prefix | Meaning | Example |
|---|---|---|
| `B-` | Beginning of entity | `B-PER` for "Barack" in "Barack Obama" |
| `I-` | Inside/continuation | `I-PER` for "Obama" in "Barack Obama" |
| `O` | Outside any entity | `O` for "is" or "the" |
Entity Types
| Label | Description | Examples |
|---|---|---|
| `PER` | Person names | "John Smith", "Marie Curie" |
| `ORG` | Organizations | "Apple Inc.", "United Nations" |
| `LOC` | Locations | "Paris", "California", "Mount Everest" |
| `MISC` | Miscellaneous | "iPhone", "Nobel Prize" |
Response Format
{
"results": [
{
"tokens": ["word1", "word2", ...], // Tokenized input
"logits": [[...], [...], ...], // Raw model outputs
"predicted_labels": ["B-PER", "O", ...] // Predicted entity tags
}
],
"model_id": "ner-tagger"
}
Troubleshooting
Unexpected Entity Recognition
Model Limitations
The model may struggle with:
- Rare or domain-specific entities
- Ambiguous contexts (e.g., "Washington" as person vs. location)
- Non-English text
- Very long sequences (>512 tokens)
Solution: Fine-tune on domain-specific data or use a specialized model.
Performance Optimization
If inference is slow:
# Consider adding a transform to reduce output size
transform: |
function Postprocess(arr)
-- Only return top prediction per token
return arr:argmax(3)
end
Server Connection Issues
# Check if server is running
curl http://localhost:8080/health
# Try different port
./build/ner-tagger.encoderfile serve --http-port 8081
Next Steps
- Sequence Classification Cookbook - Build a sentiment analyzer
- Embedding Cookbook - Create a semantic search engine
- Transforms Reference - Learn about custom post-processing
- API Reference - Complete API documentation
Summary
You've learned to:
- ✅ Export a token classification model to ONNX
- ✅ Build a self-contained encoderfile binary
- ✅ Deploy as a REST API server
- ✅ Make predictions via HTTP and CLI
- ✅ Understand NER output format
The encoderfile you built is production-ready and can be deployed anywhere without dependencies!