WisdomInterface

New features to get to production faster

Models are typically trained in full precision, such as FP32, which can make them too large and too slow for production inference. With the latest release, the Intel Distribution of OpenVINO toolkit adds a new post-training optimization tool that converts models into low-precision formats, such as INT8. That means developers can reduce latency, memory use, and on-disk footprint without having to retrain their models.
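To make the idea concrete, here is a minimal sketch of the math behind INT8 post-training quantization: a scale and zero-point are derived from a tensor's observed value range (as a calibration step would do), and FP32 values are mapped onto the 256 INT8 levels. This is a conceptual illustration only, not the OpenVINO tool's actual API; the function names are hypothetical.

```python
import numpy as np

def quantize_int8(weights):
    """Affine post-training quantization of an FP32 tensor to INT8.
    Hypothetical helper for illustration -- not the OpenVINO API.
    Scale and zero-point come from the tensor's min/max, the same idea
    a calibration-based post-training tool relies on."""
    w_min, w_max = float(weights.min()), float(weights.max())
    # Spread the observed range across the 256 INT8 levels.
    scale = (w_max - w_min) / 255.0 or 1.0  # avoid zero scale for constant tensors
    zero_point = int(round(-w_min / scale)) - 128
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map INT8 values back to approximate FP32 values."""
    return (q.astype(np.float32) - zero_point) * scale
```

Because each INT8 value occupies one byte instead of four, weights shrink roughly 4x on disk and in memory, at the cost of a small, bounded reconstruction error per value. In practice, tools like the one described above handle this per-layer and validate accuracy on a calibration dataset.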

The new release also adds support for custom layers on Intel Movidius VPUs. Previously, developers who needed to customize certain deep learning layers in their trained models, whether for a specific use case, latency target, or other requirement, could not do so on every target. Now that customization works across platforms, with custom layer support for Intel VPUs as well as CPUs and integrated GPUs.
