QDepthwiseConv2D produces wrong output — alternating channels zeroed in both io_parallel and io_stream (QKeras + Vivado backend) #1472

@TagwaM-M

Environment

  • hls4ml version: 1.3.0
  • Backend: Vivado
  • QKeras version: 0.9.0
  • TensorFlow: 2.13.1
  • Keras: 2.13.1
  • Python: 3.10
  • OS: Ubuntu 22.04 (WSL2 on Windows)
  • Vivado/Vitis HLS: 2022.1

Description

QDepthwiseConv2D layers produce systematically wrong outputs in both
io_parallel and io_stream. The output shows an alternating channel
pattern: even-indexed channels (c=0, 2, 4, ...) produce correct
values, while odd-indexed channels (c=1, 3, 5, ...) produce large
positive values (roughly 5 to 8). This suggests the odd channels receive
identity-passed activations instead of filtered outputs.
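
One way to make the alternating pattern visible is to reduce the error over the spatial axes and look at even and odd channels separately. A minimal sketch, assuming both outputs are channels-last arrays holding the same number of elements; `per_channel_error` is a hypothetical helper, and the demo arrays are synthetic, not data from this model.

```python
import numpy as np

def per_channel_error(ref, test):
    """Max absolute error per channel for channels-last arrays."""
    ref = np.asarray(ref, dtype=np.float64)
    # Reshape in case the second output comes back flattened.
    test = np.asarray(test, dtype=np.float64).reshape(ref.shape)
    # Reduce over every axis except the last (channel) axis.
    axes = tuple(range(ref.ndim - 1))
    return np.abs(ref - test).max(axis=axes)

# Synthetic demo: corrupt only the odd channels of a (1, 4, 4, 8) tensor.
rng = np.random.default_rng(0)
ref = rng.normal(size=(1, 4, 4, 8))
bad = ref.copy()
bad[..., 1::2] += 5.0

err = per_channel_error(ref, bad)
print("even channels max err:", err[0::2].max())
print("odd channels max err: ", err[1::2].max())
```

Applied to the output of a single layer, a result where only `err[1::2]` is large would match the pattern described here.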

Weight file generation is ruled out: the weight files on disk exactly
match the Keras quantized weights (verified with np.allclose). That points
to the HLS computation template as the source of the bug.
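
The weight check could look roughly like the following. This is a sketch under assumptions: it treats an exported weight file as comma-separated text, and the file name, the `weights_match` helper, and the demo kernel are all hypothetical. The axis order of the flattened data may also differ from the Keras kernel layout, so a transpose may be needed in practice.

```python
import os
import tempfile
import numpy as np

def weights_match(path, keras_weights, atol=1e-6):
    """Compare a comma-separated weight file against an array, flattened."""
    with open(path) as f:
        text = f.read()
    tokens = text.replace("\n", ",").split(",")
    on_disk = np.array([float(tok) for tok in tokens if tok.strip()])
    expected = np.asarray(keras_weights, dtype=np.float64).ravel()
    return on_disk.size == expected.size and np.allclose(on_disk, expected, atol=atol)

# Demo with a temporary file standing in for an exported weight file.
kernel = np.arange(18, dtype=np.float64).reshape(3, 3, 2, 1) * 0.0625
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "w_demo.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write(", ".join(repr(float(v)) for v in kernel.ravel()))
    ok = weights_match(path, kernel)
    mismatch = weights_match(path, kernel + 0.5)
print(ok, mismatch)
```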

Model Architecture

ResNet-style model with depthwise separable blocks:

  • QDepthwiseConv2D + QConv2DBatchnorm (fused BN)
  • Residual skip connections (Add layers)
  • Branching: one tensor feeds two paths (GlobalAveragePooling + DepthwiseConv)
  • Input shape: (16, 35, 1)
  • Channel progression: 1 → 16 → 32 → 64

from qkeras import QDepthwiseConv2D, quantized_bits

def KQ():
    # 8-bit kernel quantizer with 3 integer bits
    return quantized_bits(bits=8, integer=3, keep_negative=True,
                          alpha=1.0, symmetric=False)

# Example of affected layer
x = QDepthwiseConv2D(
    kernel_size=3, strides=1, padding='same', use_bias=False,
    depthwise_quantizer=KQ(), bias_quantizer=KQ(),
    name='refine_low_dw'
)(feat)  # feat branches to both GlobalAveragePooling and this layer
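
For reference, a depthwise convolution filters each channel with its own kernel and never mixes channels, so a channel that passes through unfiltered shows up as a clear mismatch. A minimal NumPy sketch for stride 1 and 'same' padding, assuming a channels-last (H, W, C) input and a (kh, kw, C) kernel with depth multiplier 1; `depthwise_conv2d_same` is a hypothetical helper for illustration, not the hls4ml or QKeras implementation.

```python
import numpy as np

def depthwise_conv2d_same(x, k):
    """Stride-1 depthwise conv with zero 'same' padding.

    x: (H, W, C) input, k: (kh, kw, C) kernel with odd kh and kw.
    """
    H, W, C = x.shape
    kh, kw, kc = k.shape
    assert kc == C and kh % 2 == 1 and kw % 2 == 1
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw), (0, 0)))
    out = np.zeros((H, W, C), dtype=np.float64)
    # Cross-correlation (no kernel flip): accumulate one kernel tap at a time.
    for i in range(kh):
        for j in range(kw):
            out += xp[i:i + H, j:j + W, :] * k[i, j, :]
    return out

x = np.random.default_rng(0).normal(size=(16, 35, 4))

# A kernel with a single 1 at the centre acts as an identity pass-through.
ident = np.zeros((3, 3, 4))
ident[1, 1, :] = 1.0
y_ident = depthwise_conv2d_same(x, ident)

# A box filter, by contrast, changes the values in every channel.
box = np.full((3, 3, 4), 1.0 / 9.0)
y_box = depthwise_conv2d_same(x, box)
```

Comparing a suspect channel against both the filtered reference and the unfiltered input is one way to test the identity-pass hypothesis.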

Reproduction Steps

import hls4ml
import numpy as np
from tensorflow import keras

# 1. Build and train model (QDepthwiseConv2D + QConv2DBatchnorm)
# 2. Create deploy model
deploy_model = keras.Model(inputs=..., outputs=[hm_head_low, count_softmax])

# 3. Convert
hls_config = hls4ml.utils.config_from_keras_model(
    deploy_model, granularity='name', backend='Vivado',
    default_precision='ap_fixed<16,12>'
)
hls_model = hls4ml.converters.convert_from_keras_model(
    deploy_model,
    hls_config=hls_config,
    backend='Vivado',
    io_type='io_parallel',   # bug present in io_stream too
    part='xczu9eg-ffvb1156-1-e',
)
hls_model.compile()

# 4. Compare
X = np.ascontiguousarray(val_x[:1], dtype=np.float32)
keras_out = deploy_model.predict(X)
hls_out   = hls_model.predict(X)
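
To narrow down which layer diverges first, per-layer outputs from both models can be compared in execution order. The sketch below assumes two dictionaries mapping layer names to output arrays have already been collected; as far as I know, hls4ml documents `hls_model.trace` and `hls4ml.model.profiling.get_ymodel_keras` for this, with tracing enabled per layer in the config. The helper `first_divergent_layer`, the tolerance, the layer name `conv_in`, and the demo data are hypothetical.

```python
import numpy as np

def first_divergent_layer(keras_trace, hls_trace, atol=0.1):
    """Return (name, max_abs_err) of the first layer exceeding atol, or None.

    Both arguments map layer names to arrays. Dict order is taken as
    execution order, and layers missing from hls_trace are skipped.
    """
    for name, ref in keras_trace.items():
        if name not in hls_trace:
            continue
        ref = np.asarray(ref, dtype=np.float64)
        out = np.asarray(hls_trace[name], dtype=np.float64).reshape(ref.shape)
        err = float(np.abs(ref - out).max())
        if err > atol:
            return name, err
    return None

# Synthetic demo: the second layer is corrupted on odd channels only.
rng = np.random.default_rng(1)
keras_trace = {
    "conv_in": rng.normal(size=(1, 4, 4, 8)),        # hypothetical layer name
    "refine_low_dw": rng.normal(size=(1, 4, 4, 8)),
}
hls_trace = {k: v.copy() for k, v in keras_trace.items()}
hls_trace["refine_low_dw"][..., 1::2] += 6.0

result = first_divergent_layer(keras_trace, hls_trace)
print(result)
```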

Observed vs Expected Output

Layer: refine_low_dw (QDepthwiseConv2D, 64 channels, 3x3)
