Articles

🦀🐍 Documenting Native Python Extensions Made With Rust and PyO3

Native Python extensions written in Rust are becoming more and more popular. The PyO3 framework makes developing them almost effortless, providing the tools needed to combine Rust’s performance with Python’s flexibility.

However, there is little information on how to document such extensions so that users can get acquainted with their public API. We recently developed an extension with a sophisticated API, so the problem became critical for us.

To solve the issue, we dove into the topic and are ready to share our recipe for building beautiful documentation for Rust/PyO3-based native Python extensions.

YOLO v7 Inference Acceleration With Structural Pruning

Delve into deep learning optimization with our in-depth article on structured pruning for the YOLOv7 DNN. Discover how this technique speeds up inference without compromising model accuracy. We dissect the intricacies of structured pruning, showing how it streamlines deep learning pipelines by reducing model complexity.
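As a taste of the technique, here is a minimal sketch of structured (channel-wise) pruning using PyTorch’s built-in pruning utilities. The layer, the 30% ratio, and the use of torch.nn.utils.prune are illustrative assumptions for this sketch, not the exact pipeline from the article.

```python
# A minimal sketch of structured (channel-wise) pruning with PyTorch's
# built-in pruning utilities. Layer names and the pruning ratio are
# illustrative only.
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(              # stand-in for a YOLOv7 backbone block
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.BatchNorm2d(64),
    nn.SiLU(),
)

for module in model.modules():
    if isinstance(module, nn.Conv2d):
        # Zero out 30% of the output filters with the smallest L2 norm.
        # dim=0 prunes whole filters, which is what makes it "structured".
        prune.ln_structured(module, name="weight", amount=0.3, n=2, dim=0)
        prune.remove(module, "weight")  # bake the mask into the weights
```

Note that masking whole filters does not shrink the tensors by itself; realizing an actual speedup requires physically removing the pruned channels and rebuilding the affected layers, and the model is then typically fine-tuned to recover accuracy.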

With concrete evidence from rigorous experimentation, we demonstrate the impressive benefits of structured pruning in accelerating YOLOv7 DNN performance. If you’re an AI enthusiast, researcher, or developer seeking to optimize object detection models, this article is a must-read. Follow the link below to gain technical insights into leveraging structured pruning for faster YOLOv7 DNN inference.

Read the article

Sharing Ctypes Structures and NumPy Arrays Between Unrelated Processes With Use of POSIX Shared Memory in Python3

Various interprocess communication (IPC) mechanisms are well supported by standard Python libraries such as threading and multiprocessing. However, these mechanisms are designed for IPC between related processes, i.e. processes spawned from a common ancestor that therefore inherit its IPC objects. It is often necessary to use IPC between unrelated processes that start independently. In this case, named IPC objects (POSIX or SysV) should be used, which let unrelated processes obtain an IPC object by a unique name. This kind of interaction is not supported by the standard Python tools.
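For illustration, here is a minimal sketch of what named POSIX shared memory looks like from Python, using the third-party posix_ipc package together with mmap and ctypes. The object name and the structure layout are assumptions made for the example.

```python
# Sharing a ctypes structure between unrelated processes through a named
# POSIX shared memory object (third-party posix_ipc package).
import ctypes
import mmap
import posix_ipc

class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_double), ("y", ctypes.c_double)]

SHM_NAME = "/demo_point"            # illustrative object name

# Writer: create (or open) the named object and map it into memory.
shm = posix_ipc.SharedMemory(SHM_NAME, posix_ipc.O_CREAT,
                             size=ctypes.sizeof(Point))
buf = mmap.mmap(shm.fd, shm.size)
shm.close_fd()                      # the mapping stays valid after closing the fd

point = Point.from_buffer(buf)      # overlay the structure on the shared buffer
point.x, point.y = 1.0, 2.0

# Reader (a completely unrelated process) opens the same object by name:
#   shm = posix_ipc.SharedMemory(SHM_NAME)
#   buf = mmap.mmap(shm.fd, shm.size); shm.close_fd()
#   point = Point.from_buffer(buf)
#
# When nobody needs the object anymore, one process removes the name:
#   posix_ipc.unlink_shared_memory(SHM_NAME)
```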

Python 3.8 introduced the multiprocessing.shared_memory module, which is the first step toward IPC tools for communication between unrelated processes. This article was originally conceived as a demonstration of that module. However, everything went wrong. As of November 29, 2019, the shared memory implementation in this module is incorrect: the shared memory object is deleted even if a process merely wants to stop using it without any intention of destroying it. Although there are two separate calls, close() and unlink(), the object is deleted when any process using it terminates, regardless of whether they are called.
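For reference, here is the usage pattern the module is documented to support, sketched with a NumPy array on top of a named block; the block name and array shape are illustrative. The article’s point is that, at the time of writing, close() and unlink() did not actually behave this way.

```python
# Intended usage pattern of multiprocessing.shared_memory (Python 3.8+).
import numpy as np
from multiprocessing import shared_memory

# Process A: create a named block and put a NumPy array into it.
shm = shared_memory.SharedMemory(name="demo_block", create=True, size=1024 * 1024)
a = np.ndarray((256, 1024), dtype=np.uint8, buffer=shm.buf)
a[:] = 42

# Process B (unrelated, started separately): attach to the block by name.
shm_b = shared_memory.SharedMemory(name="demo_block")
b = np.ndarray((256, 1024), dtype=np.uint8, buffer=shm_b.buf)
assert int(b[0, 0]) == 42

# close() is supposed to mean "this process is done with the mapping",
# while unlink() is supposed to destroy the block itself exactly once.
shm_b.close()
shm.close()
shm.unlink()
```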

Read more ...

Video stream copying to a ring buffer of files with fixed duration using OpenCV, Python, and ZeroMQ

The problem described in this article often arises in intelligent video analytics solutions. In general, a user wants access to a fragment that contains an identified event. A fragment is a set of one or more small video files that capture both the event itself and the lead-up to and development of the situation around it.

It is convenient to solve this task with a ring buffer of video fragments stored as small files, e.g. 30 seconds each. When the application detects that some of them contain important events, it copies the corresponding files from the ring buffer into permanent storage. The buffer is a ring because old files are deleted from disk (for example, after 10 minutes), so it always occupies a bounded amount of storage.
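To make the idea concrete, here is a minimal sketch of the ring-buffer bookkeeping: fragments older than the retention window are deleted, and fragments overlapping a detected event are copied to permanent storage. The directories, the .avi extension, and the use of file modification times are illustrative assumptions, not the article’s exact implementation.

```python
# Ring-buffer bookkeeping: prune stale fragments, archive flagged ones.
import glob
import os
import shutil
import time

BUFFER_DIR = "/tmp/ring_buffer"     # where 30-second fragments are written
ARCHIVE_DIR = "/tmp/archive"        # permanent storage for flagged fragments
RETENTION_SEC = 10 * 60             # keep at most 10 minutes of history

def prune_old_fragments(now: float | None = None) -> None:
    """Delete fragments whose modification time fell out of the window."""
    now = now or time.time()
    for path in glob.glob(os.path.join(BUFFER_DIR, "*.avi")):
        if now - os.path.getmtime(path) > RETENTION_SEC:
            os.remove(path)

def archive_event(event_start: float, event_end: float) -> None:
    """Copy every fragment overlapping [event_start, event_end] to the archive."""
    for path in sorted(glob.glob(os.path.join(BUFFER_DIR, "*.avi"))):
        mtime = os.path.getmtime(path)          # roughly the fragment's end time
        if event_start - 30 <= mtime <= event_end + 30:
            shutil.copy(path, ARCHIVE_DIR)
```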

You will learn how to develop such a ring buffer, which connects to the main video processing pipeline and manages the creation and deletion of the video files that form the buffer. OpenCV, Python, LZ4, and ZeroMQ will be used to solve the problem. For simplicity, we assume that the FPS of the video files matches the FPS of the stream, i.e. every frame from the stream is written to a file. Real implementations may drop redundant frames, change the resolution, and so on.
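As a rough sketch of the buffer writer (not the article’s exact implementation): frames arrive over a ZeroMQ SUB socket as LZ4-compressed raw bytes and are written into fixed-duration files with OpenCV’s VideoWriter. The endpoint, frame geometry, codec, and chunk length are assumptions.

```python
# Buffer writer: receive compressed frames over ZeroMQ, write 30-second files.
import time

import cv2
import lz4.frame
import numpy as np
import zmq

WIDTH, HEIGHT, FPS = 1280, 720, 30
CHUNK_SEC = 30                                  # duration of one fragment

ctx = zmq.Context()
sub = ctx.socket(zmq.SUB)
sub.connect("tcp://127.0.0.1:5555")
sub.setsockopt(zmq.SUBSCRIBE, b"")

writer, chunk_started = None, 0.0
fourcc = cv2.VideoWriter_fourcc(*"XVID")

while True:
    payload = sub.recv()                        # one compressed frame per message
    frame = np.frombuffer(lz4.frame.decompress(payload), dtype=np.uint8)
    frame = frame.reshape((HEIGHT, WIDTH, 3))

    now = time.time()
    if writer is None or now - chunk_started >= CHUNK_SEC:
        if writer is not None:
            writer.release()                    # close the finished fragment
        path = f"/tmp/ring_buffer/{int(now)}.avi"
        writer = cv2.VideoWriter(path, fourcc, FPS, (WIDTH, HEIGHT))
        chunk_started = now

    writer.write(frame)
```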

Read more ...

High-performance Scalable Realtime Distributed Video Processing With Python3, OpenCV and ZeroMQ

OpenCV is a popular framework widely used in the development of products for intelligent video analytics. Such solutions use both classic computer vision algorithms (e.g. optical flow estimation) and AI-based approaches, in particular neural networks.
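As a small illustration of such a classic algorithm, here is a dense optical flow computation with OpenCV’s Farneback method; the video path is assumed, and the parameter values are the usual textbook defaults rather than anything taken from the article.

```python
# Dense optical flow between consecutive frames with the Farneback method.
import cv2

cap = cv2.VideoCapture("input.mp4")             # illustrative video source
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # flow[y, x] holds the (dx, dy) displacement of each pixel between frames
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    prev_gray = gray

cap.release()
```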

Most such algorithms are resource-intensive and need a large amount of computing power to process even a single video stream in real time. By resources we mean CPU cores and GPUs, as well as other hardware accelerators.

The camera or another source delivers a video stream at a certain frame rate (FPS, frames per second), and the analytical platform must process the frames one by one. Each video frame takes a significant amount of memory: for a 4K frame with 24-bit color depth, a NumPy ndarray occupies roughly 24 MB of RAM, so one second of a 30 FPS stream takes over 700 MB. If the processing pipeline is too slow, the system may fail due to RAM or disk space exhaustion, or start losing frames. Thus, the pipeline must be efficient enough to process every single frame within a fixed time budget (e.g. 33 ms for a 30 FPS video).
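A quick sanity check of these numbers, sketched with NumPy (the 3840×2160 geometry is an assumption for “4K”):

```python
# Per-frame memory for a 4K, 24-bit frame stored as a NumPy ndarray,
# and the per-frame time budget at 30 FPS.
import numpy as np

frame = np.zeros((2160, 3840, 3), dtype=np.uint8)    # one 4K BGR frame
per_frame_mib = frame.nbytes / 2**20                 # ~23.7 MiB per frame
per_second_mib = per_frame_mib * 30                  # ~712 MiB per second at 30 FPS
budget_ms = 1000 / 30                                # ~33 ms to process each frame

print(f"{per_frame_mib:.1f} MiB/frame, {per_second_mib:.0f} MiB/s, "
      f"{budget_ms:.0f} ms/frame budget")
```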

Read more ...