Efficient SIEM and Detection Engineering in 10 steps
SIEM systems and detection engineering are not just about data and detection rules. Planning and processes are becoming increasingly…
Mar 25, 2023

10 most important MITRE ATT&CK sources in one click using Pandas
MITRE ATT&CK is a source of knowledge about adversarial tactics and techniques. It is a common domain language in the world of cyber…
Apr 5, 2023

How To Clean Data with Python Pandas: Vehicles registered in Poland
Thanks to the Open Data project, we have sources made available by Polish public entities. In this article, we will prepare and clean the…
Mar 19, 2023

ksqlDB: real-time SQL magic in the cybersecurity scenario (part 1)
ksqlDB is a solution from the Apache Kafka and Confluent family. It allows you to use SQL to define stream processing jobs. This story…
Feb 4, 2022

Change Data Capture: Convert your database into a stream with Debezium
Have you ever thought about creating a stream from database operations? In this story, you will learn what Change Data Capture is and how…
Jan 30, 2021

How to use Variables and XCom in Apache Airflow?
It is said that Apache Airflow is CRON on steroids. It is gaining popularity among tools for ETL orchestration (Scheduling, managing and…
Dec 11, 2020

Readable Scala Code in Apache Spark (4 attempts)
Jupyter and Apache Zeppelin are good places to experiment with data. Unfortunately, the specifics of notebooks do not encourage to…
Oct 31, 2020

Published in The Startup
Twitter Data Analysis for the Lazy in Elastic Stack (Xbox VS PlayStation)
Twitter data can be obtained in many ways, but who wants to write the code 😉. Especially one that will work 24/7. In Elastic Stack you…
Oct 18, 2020

Published in ITNEXT
Kafka Connect in a nutshell
Kafka Connect is part of the Apache Kafka platform. It is used to connect Kafka with external services such as file systems and databases…
Oct 6, 2020

Published in ITNEXT
5 Pitfalls of NoSQL Databases
I recorded a video in which I talk about the advantages of NoSQL databases. The response was interesting, but I had the impression that…
Sep 20, 2020

Published in ITNEXT
PySpark ETL from MySQL and MongoDB to Cassandra
In Apache Spark/PySpark we use abstractions and the actual processing is done only when we want to materialize the result of the operation…
Sep 14, 2020

Published in ITNEXT
How to Elastic SIEM (part 2)
This is a continuation of the previous story. This time we will look at the Detections tab in Elastic SIEM. Our goal is to automate IOC…
Aug 29, 2020

Published in ITNEXT
How to Elastic SIEM (part 1)
IT environments are becoming increasingly large, distributed and difficult to manage. All system components must be protected and…
Aug 20, 2020

Published in ITNEXT
How to set up local Apache Spark environment (5 ways)
Apache Spark is one of the most popular platforms for distributed data processing and analysis. Although it is associated with a server…
Aug 15, 2020

Published in ITNEXT
How To Start with Apache Spark and Apache Cassandra
Apache Cassandra is a specific database that scales linearly. This has its price: specific table modelling, configurable consistency and…
Jul 23, 2020

Published in Python in Plain English
Big Data without Hadoop/HDFS? MinIO tested on Jupyter + PySpark
The takeover of Hortonworks by Cloudera ended the free distribution of Hadoop. Therefore, a lot of people are looking for alternative…
Jul 15, 2020

How to provide failover for Logstash or other log collector using keepalived
When planning the system we take into account possible failures (Design for Failure). In the case of log aggregation we use solutions such…
Jul 7, 2020

Published in Python in Plain English
Koalas, or PySpark disguised as Pandas
One of the basic Data Scientist tools is Pandas. Unfortunately, the excess of data can significantly ruin our fun. That is why Koalas was…
Jul 2, 2020

Published in The Startup
Does Elasticsearch lie? How does Elasticsearch work?
Elasticsearch surprises us with its capabilities and speed of action, but does it return the correct results? In this post, you’ll learn…
Jun 23, 2020