tfmot.sparsity.keras.PruneForLatencyOnXNNPack

A pruning policy that restricts pruning to the 1x1 Conv2D layers in the model.

Inherits From: PruningPolicy

Used in the notebooks

Used in the guide
Pruning for on-device inference w/ XNNPACK

PruneForLatencyOnXNNPack checks that the model contains a subgraph that can leverage XNNPACK's sparse inference, and applies the pruning wrapper only to Conv2D layers with kernel_size = (1, 1).
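
A minimal usage sketch, assuming the policy is passed through the `pruning_policy` argument of `tfmot.sparsity.keras.prune_low_magnitude`. The helper name `prune_for_xnnpack` and the default sparsity values are illustrative choices, not part of the library.

```python
def prune_for_xnnpack(model, target_sparsity=0.75, begin_step=0):
    """Wrap a Keras model for pruning, limited by PruneForLatencyOnXNNPack.

    `prune_for_xnnpack` is a hypothetical helper; the defaults are examples.
    """
    # Imported lazily so the helper can be defined without TensorFlow loaded.
    import tensorflow_model_optimization as tfmot

    # Hold the sparsity level constant from `begin_step` onward.
    schedule = tfmot.sparsity.keras.ConstantSparsity(
        target_sparsity=target_sparsity, begin_step=begin_step)

    # The policy decides which layers receive the pruning wrapper.
    return tfmot.sparsity.keras.prune_low_magnitude(
        model,
        pruning_schedule=schedule,
        pruning_policy=tfmot.sparsity.keras.PruneForLatencyOnXNNPack(),
    )
```

Calling the helper on a model that the policy does not support is expected to fail during the policy's model check rather than silently skip pruning.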

Reference

Fast Sparse ConvNets
XNNPACK Sparse Inference

Methods

`allow_pruning`

allow_pruning(
    layer
)

Allows pruning only for 1x1 Conv2D layers.
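
The documented rule can be sketched in plain Python as follows. `LayerInfo` and `is_1x1_conv2d` are hypothetical stand-ins used only to illustrate the condition; they are not tfmot types, and the library's own check may differ in detail.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LayerInfo:
    """Hypothetical stand-in for a Keras layer (not a tfmot or Keras type)."""
    kind: str
    kernel_size: Optional[Tuple[int, int]] = None


def is_1x1_conv2d(layer: LayerInfo) -> bool:
    """Return True only for a Conv2D layer whose kernel_size is (1, 1)."""
    return layer.kind == "Conv2D" and layer.kernel_size == (1, 1)


# A pointwise convolution qualifies; a 3x3 convolution does not.
pointwise = LayerInfo("Conv2D", (1, 1))
spatial = LayerInfo("Conv2D", (3, 3))
```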

`ensure_model_supports_pruning`

ensure_model_supports_pruning(
    model
)

Ensures that the model contains only supported layers.
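
A small sketch of checking a model before wrapping it, assuming the method signals an unsupported model by raising `ValueError`. The helper name `supports_policy` is hypothetical.

```python
def supports_policy(model, policy) -> bool:
    """Return True if `policy` accepts `model`, False otherwise.

    Assumes `policy.ensure_model_supports_pruning` raises ValueError
    for a model it cannot handle. `supports_policy` is illustrative only.
    """
    try:
        policy.ensure_model_supports_pruning(model)
    except ValueError:
        return False
    return True
```

With the real library, `policy` would be an instance of `tfmot.sparsity.keras.PruneForLatencyOnXNNPack` and `model` a Keras model.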

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2023-05-26 UTC.