QDepthwiseConv2D produces wrong output — alternating channels zeroed in both io_parallel and io_stream (QKeras + Vivado backend)

## Environment
- hls4ml version: 1.3.0
- Backend: Vivado
- QKeras version: 0.9.0
- TensorFlow: 2.13.1
- Keras: 2.13.1
- Python: 3.10
- OS: Ubuntu 22.04 (WSL2 on Windows)
- Vivado/Vitis HLS: 2022.1

## Description
`QDepthwiseConv2D` layers produce systematically wrong outputs with both
`io_type='io_parallel'` and `io_type='io_stream'`. The errors alternate by
channel: even-indexed channels (c=0, 2, 4, ...) match the Keras output, while
odd-indexed channels (c=1, 3, 5, ...) hold large positive values (~5-8).
This suggests the odd channels are receiving identity-passed activations
instead of filtered outputs.
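The alternating pattern and the identity-pass hypothesis can both be checked numerically. The following is a minimal sketch, assuming channels-last arrays for the Keras layer output, the HLS layer output, and the layer input are already available; the function names and the `atol` threshold are illustrative and not part of hls4ml or QKeras.

```python
import numpy as np

def per_channel_error(ref, test, atol=0.05):
    """Return (max abs error per channel, indices of channels above atol)."""
    ref = np.asarray(ref, dtype=np.float64)
    # HLS outputs may come back flattened, so reshape to the reference shape.
    test = np.asarray(test, dtype=np.float64).reshape(ref.shape)
    # Collapse every axis except the last (channel) axis.
    err = np.abs(ref - test).reshape(-1, ref.shape[-1]).max(axis=0)
    bad = np.flatnonzero(err > atol)
    return err, bad

def parity_summary(bad):
    """Count failing channels by even/odd channel index."""
    bad = np.asarray(bad)
    return {"even_bad": int(np.sum(bad % 2 == 0)),
            "odd_bad": int(np.sum(bad % 2 == 1))}

def identity_like_channels(layer_in, test, atol=0.05):
    """Channels whose output equals the layer input within atol.

    Assumes input and output shapes match (depth_multiplier=1,
    strides=1, padding='same'), as for the layer in this report.
    """
    n_channels = np.asarray(layer_in).shape[-1]
    _, differs = per_channel_error(layer_in, test, atol)
    return np.setdiff1d(np.arange(n_channels), differs)
```

If the hypothesis holds, `parity_summary` should report failures only on odd indices, and `identity_like_channels` should return the same odd indices.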

The bug appears to be in the **HLS computation template**, not in
weight file generation: the weight files on disk exactly match the Keras
quantized weights (verified with `np.allclose`).
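A weight check of this kind can be sketched as below. This is an illustration under two assumptions, not the exact script used for this report: the weight file is plain text with comma-separated decimal values, and its order matches a row-major flattening of the reference array. The function name `weights_file_matches` is hypothetical.

```python
import numpy as np

def weights_file_matches(txt_path, ref_weights, atol=1e-6):
    """Compare a text weight file against a reference weight array."""
    # Assumption: comma-separated decimal values, no trailing comma.
    on_disk = np.loadtxt(txt_path, delimiter=",", ndmin=1).ravel()
    # Assumption: file order equals row-major flattening of ref_weights.
    ref = np.asarray(ref_weights, dtype=np.float64).ravel()
    if on_disk.size != ref.size:
        return False
    return bool(np.allclose(on_disk, ref, atol=atol))
```

A size mismatch or a different flattening order would make this return `False` even when the values are correct, so a failure here calls for a closer look at the layout before blaming the values.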

## Model Architecture
ResNet-style model with depthwise separable blocks:
- QDepthwiseConv2D + QConv2DBatchnorm (fused BN)
- Residual skip connections (Add layers)
- Branching: one tensor feeds two paths (GlobalAveragePooling + DepthwiseConv)
- Input shape: (16, 35, 1)
- Channel progression: 1 → 16 → 32 → 64

```python
from qkeras import QDepthwiseConv2D, quantized_bits

def KQ():
    # 8-bit signed quantizer with 3 integer bits
    return quantized_bits(bits=8, integer=3, keep_negative=True,
                          alpha=1.0, symmetric=False)

# Example of affected layer
x = QDepthwiseConv2D(
    kernel_size=3, strides=1, padding='same', use_bias=False,
    depthwise_quantizer=KQ(), bias_quantizer=KQ(),
    name='refine_low_dw'
)(feat)  # feat branches to both GlobalAveragePooling and this layer
```

## Reproduction Steps

```python
import hls4ml
import numpy as np
from tensorflow import keras

# 1. Build and train model (QDepthwiseConv2D + QConv2DBatchnorm)
# 2. Create deploy model
deploy_model = keras.Model(inputs=..., outputs=[hm_head_low, count_softmax])

# 3. Convert
hls_config = hls4ml.utils.config_from_keras_model(
    deploy_model, granularity='name', backend='Vivado',
    default_precision='ap_fixed<16,12>'
)
hls_model = hls4ml.converters.convert_from_keras_model(
    deploy_model,
    hls_config=hls_config,
    backend='Vivado',
    io_type='io_parallel',   # bug present in io_stream too
    part='xczu9eg-ffvb1156-1-e',
)
hls_model.compile()

# 4. Compare
X = np.ascontiguousarray(val_x[:1], dtype=np.float32)
keras_out = deploy_model.predict(X)
hls_out   = hls_model.predict(X)
```
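Because the deploy model has two outputs, the comparison in step 4 has to be done per output. A minimal sketch, assuming both `predict` calls return either one array or a list of arrays in the same order, and that the HLS arrays may be flattened; `compare_outputs` and the `atol` value are illustrative choices.

```python
import numpy as np

def compare_outputs(keras_outs, hls_outs, atol=0.05):
    """Return (shape, max abs error, within_tolerance) for each output."""
    # Wrap single-output results so both cases are handled uniformly.
    if not isinstance(keras_outs, (list, tuple)):
        keras_outs, hls_outs = [keras_outs], [hls_outs]
    report = []
    for k, h in zip(keras_outs, hls_outs):
        k = np.asarray(k, dtype=np.float64)
        # Reshape the HLS result to the Keras shape before comparing.
        h = np.asarray(h, dtype=np.float64).reshape(k.shape)
        max_err = float(np.abs(k - h).max())
        report.append((k.shape, max_err, max_err <= atol))
    return report
```

With the variables from the snippet above, `compare_outputs(keras_out, hls_out)` would give one tuple per model output.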

## Observed vs Expected Output

**Layer: refine_low_dw (QDepthwiseConv2D, 64 channels, 3x3)**
