Description
Describe the issue
In TF2, the full-integer quantized models produced by the TFLite Converter can only have float input and output types. This is a blocker for users who require int8 or uint8 input and/or output types.
UPDATE: We now support this workflow.
End-to-End Tutorial: https://colab.sandbox.google.com/github/google-coral/tutorials/blob/master/retrain_classification_ptq_tf2.ipynb
Only TFLite Conversion: Convert TF Models to TFLite Full-Integer models
You can refer to the code here, also given below:
import tensorflow as tf
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
def representative_dataset_gen():
  for _ in range(num_calibration_steps):
    # Get sample input data as a numpy array in a method of your choosing.
    yield [input]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8 # or tf.uint8
converter.inference_output_type = tf.int8 # or tf.uint8
tflite_model = converter.convert()
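In the snippet above, keras_model, num_calibration_steps, and input are placeholders you must fill in. As a rough, self-contained sketch of one way to do that (the calibration_data name and its (100, 100) shape are made-up values chosen to match the toy (1, 100) input used below), the representative dataset and a post-conversion check of the model's I/O types could look like this:
import numpy as np
import tensorflow as tf
# Hypothetical calibration set: 100 float samples shaped like the model's input.
calibration_data = np.random.uniform(low=0, high=10, size=(100, 100)).astype(np.float32)
def representative_dataset_gen():
  for sample in calibration_data:
    # The converter expects a list with one array per model input, batch dim included.
    yield [sample[np.newaxis, :]]
# After convert() succeeds, the model's input and output tensors should report int8.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
assert interpreter.get_input_details()[0]['dtype'] == np.int8
assert interpreter.get_output_details()[0]['dtype'] == np.int8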
Only TFLite Inference: Run inference on the TFLite model
Note the one caveat with integer-only models: you need to manually map (i.e., quantize) the float inputs to integer inputs during inference. Per the equation in the TensorFlow Lite 8-bit quantization specification document, real_value = (int_value - zero_point) * scale, so inputs are quantized with int_value = real_value / scale + zero_point and outputs are dequantized with the equation as written. Its equivalent code in Python is given below:
import numpy as np
import tensorflow as tf
# The TF model's input is a float array with values in the range [0, 10] and shape (1, 100)
np.random.seed(0)
tf_input = np.random.uniform(low=0, high=10, size=(1, 100)).astype(np.float32)
# Output of the TF model.
tf_output = keras_model.predict(tf_input)
# Output of the TFLite model.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
# Manually quantize the input from float to integer
scale, zero_point = input_details['quantization']
tflite_integer_input = tf_input / scale + zero_point
tflite_integer_input = tflite_integer_input.astype(input_details['dtype'])
interpreter.set_tensor(input_details['index'], tflite_integer_input)
interpreter.invoke()
output_details = interpreter.get_output_details()[0]
tflite_integer_output = interpreter.get_tensor(output_details['index'])
# Manually dequantize the output from integer to float
scale, zero_point = output_details['quantization']
tflite_output = tflite_integer_output.astype(np.float32)
tflite_output = (tflite_output - zero_point) * scale
# Verify that the TFLite model's output is approximately (expect some loss in
# accuracy due to quantization) the same as the TF model's output
assert np.allclose(tflite_output, tf_output, atol=1e-04)
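Once the converted model checks out, the serialized flatbuffer can also be written to disk and loaded back by path rather than kept in memory; a minimal sketch (the filename model_int8.tflite is arbitrary):
# Write the converted flatbuffer to disk...
with open('model_int8.tflite', 'wb') as f:
  f.write(tflite_model)
# ...and later load it by path instead of by content.
interpreter = tf.lite.Interpreter(model_path='model_int8.tflite')
interpreter.allocate_tensors()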