Environment
- hls4ml version: 1.3.0
- Backend: Vivado
- QKeras version: 0.9.0
- TensorFlow: 2.13.1
- Keras: 2.13.1
- Python: 3.10
- OS: Ubuntu 22.04 (WSL2 on Windows)
- Vivado/Vitis HLS: 2022.1
Description
QDepthwiseConv2D layers produce systematically wrong outputs with both
io_parallel and io_stream. The output shows an alternating channel
pattern: even-indexed channels (c=0, c=2, c=4, ...) produce correct
values, while odd-indexed channels (c=1, c=3, c=5, ...) produce large,
incorrect positive values (roughly 5 to 8). This suggests the odd channels
receive the input activations passed through unchanged instead of the
filtered outputs.
The bug is confirmed to be in the HLS computation template, not in
weight file generation. The weight files on disk exactly match the Keras
quantized weights (verified with np.allclose).
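The even/odd pattern and the identity-pass hypothesis described above can be checked numerically. The following is a minimal sketch, not part of the original report: the helper names `channel_parity_report` and `identity_like_channels` are hypothetical, the data is synthetic, and it assumes channels-last tensors and a depthwise layer whose output has the same shape as its input (stride 1, 'same' padding, as in the affected layer).

```python
import numpy as np

def channel_parity_report(ref, test):
    """Mean absolute error per channel, summarized for even and odd channels.

    Both arrays are channels-last, e.g. (batch, height, width, channels).
    """
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64).reshape(ref.shape)
    # Average over every axis except the last (channel) axis.
    axes = tuple(range(ref.ndim - 1))
    per_channel = np.abs(ref - test).mean(axis=axes)
    return {
        "per_channel": per_channel,
        "even_mae": float(per_channel[0::2].mean()),
        "odd_mae": float(per_channel[1::2].mean()),
    }

def identity_like_channels(layer_in, test, atol=1e-3):
    """Indices of channels whose output equals the layer input within atol."""
    layer_in = np.asarray(layer_in, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64).reshape(layer_in.shape)
    axes = tuple(range(layer_in.ndim - 1))
    max_diff = np.abs(layer_in - test).max(axis=axes)
    return np.flatnonzero(max_diff <= atol)

# Synthetic stand-ins (assumption: real data would come from the Keras
# and HLS models at the input and output of the affected layer).
rng = np.random.default_rng(0)
layer_in = rng.uniform(5.0, 8.0, size=(1, 16, 35, 64))
ref_out = rng.normal(0.0, 0.5, size=layer_in.shape)
bad_out = ref_out.copy()
bad_out[..., 1::2] = layer_in[..., 1::2]  # odd channels pass the input through

report = channel_parity_report(ref_out, bad_out)
suspects = identity_like_channels(layer_in, bad_out)
print(report["even_mae"], report["odd_mae"], suspects[:4])
```

If the odd channels really are identity-passed, `identity_like_channels` should return exactly the odd indices when given the real layer input and HLS output; an empty result would point to a different failure mode.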
Model Architecture
ResNet-style model with depthwise separable blocks:
- QDepthwiseConv2D + QConv2DBatchnorm (fused BN)
- Residual skip connections (Add layers)
- Branching: one tensor feeds two paths (GlobalAveragePooling + DepthwiseConv)
- Input shape: (16, 35, 1)
- Channel progression: 1 → 16 → 32 → 64
from qkeras import QDepthwiseConv2D, quantized_bits

def KQ():
    # Signed 8-bit quantizer with 3 integer bits
    return quantized_bits(bits=8, integer=3, keep_negative=True,
                          alpha=1.0, symmetric=False)

# Example of affected layer
x = QDepthwiseConv2D(
    kernel_size=3, strides=1, padding='same', use_bias=False,
    depthwise_quantizer=KQ(), bias_quantizer=KQ(),
    name='refine_low_dw'
)(feat)  # feat branches to both GlobalAveragePooling and this layer
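For readers unfamiliar with the quantizer settings, here is a rough NumPy emulation of the value grid that KQ() implies. It is a sketch, not QKeras code: `fake_quantized_bits` is a hypothetical name, and it assumes the usual reading of `quantized_bits(bits=8, integer=3, keep_negative=True, alpha=1.0, symmetric=False)` as one sign bit, three integer bits, and four fractional bits with a fixed scale of 1.

```python
import numpy as np

def fake_quantized_bits(x, bits=8, integer=3, keep_negative=True):
    """Round x onto a fixed-point grid and clip to its range.

    Emulation only (assumption): scale is fixed at 1 and the range is
    asymmetric, as with alpha=1.0 and symmetric=False.
    """
    frac_bits = bits - integer - (1 if keep_negative else 0)
    step = 2.0 ** -frac_bits                 # smallest representable increment
    hi = 2.0 ** integer - step               # largest representable value
    lo = -(2.0 ** integer) if keep_negative else 0.0
    x = np.asarray(x, dtype=np.float64)
    return np.clip(np.round(x / step) * step, lo, hi)

w = np.array([-9.0, -0.03, 0.04, 0.3, 7.99])
print(fake_quantized_bits(w))
```

Under these assumptions the weights lie on a grid of step 1/16 in the range [-8, 7.9375], which is a useful sanity check when comparing the weight files against the Keras weights.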
Reproduction Steps
import hls4ml
import numpy as np
from tensorflow import keras

# 1. Build and train model (QDepthwiseConv2D + QConv2DBatchnorm)

# 2. Create deploy model
deploy_model = keras.Model(inputs=..., outputs=[hm_head_low, count_softmax])

# 3. Convert
hls_config = hls4ml.utils.config_from_keras_model(
    deploy_model, granularity='name', backend='Vivado',
    default_precision='ap_fixed<16,12>'
)
hls_model = hls4ml.converters.convert_from_keras_model(
    deploy_model,
    hls_config=hls_config,
    backend='Vivado',
    io_type='io_parallel',  # bug present in io_stream too
    part='xczu9eg-ffvb1156-1-e',
)
hls_model.compile()

# 4. Compare
X = np.ascontiguousarray(val_x[:1], dtype=np.float32)
keras_out = deploy_model.predict(X)
hls_out = hls_model.predict(X)
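Because the deploy model has two outputs, the comparison in step 4 has to be done per output. A minimal sketch of such a comparison follows; `compare_outputs` is a hypothetical helper, the tolerance is arbitrary, the arrays are synthetic with made-up shapes, and the reshape assumes the HLS output may arrive flattened relative to the Keras output.

```python
import numpy as np

def compare_outputs(ref_outs, test_outs, atol=1e-2):
    """Compare matching outputs of two models; return one summary per output."""
    # Accept a single array or a list of arrays for either argument.
    if not isinstance(ref_outs, (list, tuple)):
        ref_outs = [ref_outs]
    if not isinstance(test_outs, (list, tuple)):
        test_outs = [test_outs]
    if len(ref_outs) != len(test_outs):
        raise ValueError("models return a different number of outputs")
    summaries = []
    for ref, test in zip(ref_outs, test_outs):
        ref = np.asarray(ref, dtype=np.float64)
        # The test output may be flattened; give it the reference shape.
        test = np.asarray(test, dtype=np.float64).reshape(ref.shape)
        diff = np.abs(ref - test)
        summaries.append({
            "shape": ref.shape,
            "max_abs_diff": float(diff.max()),
            "mean_abs_diff": float(diff.mean()),
            "close": bool(np.allclose(ref, test, atol=atol)),
        })
    return summaries

# Synthetic stand-ins for keras_out and hls_out (shapes are made up).
rng = np.random.default_rng(1)
fake_keras = [rng.normal(size=(1, 16, 35, 1)), rng.normal(size=(1, 4))]
fake_hls = [fake_keras[0].reshape(1, -1), fake_keras[1] + 0.5]
summaries = compare_outputs(fake_keras, fake_hls)
for s in summaries:
    print(s)
```

With the real models, the call would be `compare_outputs(keras_out, hls_out)`; an output whose `close` flag is False is the one worth tracing back layer by layer.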
Observed vs Expected Output
Layer: refine_low_dw (QDepthwiseConv2D, 64 channels, 3x3)