Bitworks Software Blog

How to Run Low-Latency Jobs With Apache Spark

April 12, 2019

Ivan Kudriavtsev

Apache Spark, Low Latency, Python, Scala, PySpark

Apache Spark is a popular framework for massively scalable data processing. It is at the heart of big-data processing systems at many companies, and it is convenient whether the computations run on a single workstation, a corporate server, or a high-performance cluster with thousands of nodes. Apache Spark has a sophisticated design and, at the same time, an easy development model, which is especially important in the early stages of product adoption. The most attractive feature of Spark is that when computations are designed well, it utilizes all the available compute capacity. Engineers don’t have to care about parallelization, multithreading, multiprocessing, and similar details: all the magic happens inside Spark.

Improving InfluxDB with Apache Kafka

March 21, 2019

Ivan Kudriavtsev, Ekaterina Maslova (English translation)

Time series, TSDB, InfluxDB, Apache Kafka

Why do we love InfluxDB? Because it is an outstanding product that makes working with time series easy. It provides high performance for both data insertion and retrieval. It offers a SQL-like query language with convenient functions for processing time-series data (for example, a derivative of values). It is supported by convenient visualization tools, such as Grafana. It provides continuous queries that handle data aggregation on the fly. And one can get started with InfluxDB within a couple of hours.

Why do we avoid using InfluxDB in projects? Because the cluster solution is not open-source: one has to pay for scaling and fault tolerance by purchasing a license. There is nothing wrong with this. However, when software is developed on a free-software basis, where all infrastructure components must be published under open licenses, there is no place for commercial products. As a result, InfluxDB cannot be introduced in the critical parts of such information systems.

Interestingly, Apache Cassandra, Kafka, HDFS, Elasticsearch, and many others provide clustered solutions for free, which leads to their greater adoption in projects.

In this article, we will illustrate how to use a supplementary Apache Kafka cluster to add scalability and fault tolerance to InfluxDB for popular use cases, without purchasing a commercial license for the InfluxDB cluster.

Wrapping Python applications into self-extracting executable files using Pyinstaller

March 15, 2019

Ivan Kudriavtsev

Python, Pyinstaller, deployment, application delivery, SFX executable

In this short article, we will explore a simple way to distribute Python applications as thick, self-extracting archives that look like ordinary executable files and contain the whole environment and all the dependencies needed to run the application.

CloudStack-UI 1.411.29 is Out

March 13, 2019

Ivan Kudryavtsev

CloudStack, User Interface

This overview describes the main enhancements in Release 1.411.29. The key feature of the iteration is a new plugin that allows users and administrators to manage resource limits and quotas for accounts. In addition, we enhanced the Log View plugin and interface components such as snapshot management, UI settings, API key management, error message display, Security Group management, and the Service Offering chooser, and fixed a range of bugs in the Pulse plugin and the system as a whole. Below you will find details on all improvements and fixes in this release.
