no code implementations • 22 Dec 2023 • Thorsten Wittkopp, Alexander Acker, Odej Kao
The realm of AIOps is transforming IT landscapes with the power of AI and ML.
no code implementations • 5 Dec 2023 • Qiao Yu, Wengui Zhang, Jorge Cardoso, Odej Kao
In this paper, we present a comprehensive study on the correlation between CEs and UEs, specifically emphasizing the importance of spatio-temporal error bit information.
1 code implementation • 5 Oct 2023 • Jiawen Xu, Claas Grohnfeldt, Odej Kao
In most works on deep incremental learning research, it is assumed that novel samples are pre-identified for neural network retraining.
no code implementations • 22 Aug 2023 • Dominik Scheinert, Philipp Wiesner, Thorsten Wittkopp, Lauritz Thamsen, Jonathan Will, Odej Kao
However, big data analytics jobs across users can share many common properties: they often operate on similar infrastructure, using similar algorithms implemented in similar frameworks.
1 code implementation • 24 May 2023 • Philipp Wiesner, Ramin Khalili, Dennis Grinwald, Pratik Agrawal, Lauritz Thamsen, Odej Kao
Federated Learning (FL) is an emerging machine learning technique that enables distributed model training across data silos or edge devices without data sharing.
no code implementations • 25 Jan 2023 • Thorsten Wittkopp, Dominik Scheinert, Philipp Wiesner, Alexander Acker, Odej Kao
Due to the complexity of modern IT services, failures can be manifold, occur at any stage, and are hard to detect.
no code implementations • 24 Nov 2022 • Dominik Scheinert, Babak Sistani Zadeh Aghdam, Soeren Becker, Odej Kao, Lauritz Thamsen
With more and more computation being shifted to the edge of the network, monitoring of critical infrastructures, such as intermediate processing nodes in autonomous driving, is further complicated by the typically resource-constrained environments.
no code implementations • 15 Nov 2022 • Dominik Scheinert, Soeren Becker, Jonathan Bader, Lauritz Thamsen, Jonathan Will, Odej Kao
Choosing a good resource configuration for big data analytics applications can be challenging, especially in cloud environments.
no code implementations • 14 Nov 2022 • Soeren Becker, Kevin Styp-Rekowski, Oliver Vincent Leon Stoll, Odej Kao
Enabled by the increasing availability of sensor data monitored from production machinery, condition monitoring and predictive maintenance methods are key pillars for an efficient and robust manufacturing production cycle in the Industrial Internet of Things.
1 code implementation • 19 Jul 2022 • Houkun Zhu, Dominik Scheinert, Lauritz Thamsen, Kordian Gontarska, Odej Kao
Distributed file systems are widely used nowadays, yet using their default configurations is often not optimal.
1 code implementation • 7 Jul 2022 • Jasmin Bogatinovski, Gjorgji Madjarov, Sasho Nedelkoski, Jorge Cardoso, Odej Kao
Artificial Intelligence for IT Operations (AIOps) describes the process of maintaining and operating large IT systems using diverse AI-enabled methods and tools for, e.g., anomaly detection and root cause analysis, to support the remediation, optimization, and automatic initiation of self-stabilizing IT activities.
1 code implementation • 6 Apr 2022 • Jasmin Bogatinovski, Sasho Nedelkoski, Li Wu, Jorge Cardoso, Odej Kao
Our experimental results demonstrate that the learned subprocess representations reduce the instability in the input, allowing CLog to outperform the baselines on the failure identification subproblems: 1) failure detection by 9-24% on F1 score and 2) failure type identification by 7% on the macro-averaged F1 score.
1 code implementation • 6 Apr 2022 • Jasmin Bogatinovski, Sasho Nedelkoski, Alexander Acker, Jorge Cardoso, Odej Kao
We start with an in-depth analysis of quality log instruction properties in nine software systems and identify two quality properties: 1) correct log level assignment assessing the correctness of the log level, and 2) sufficient linguistic structure assessing the minimal richness of the static text necessary for verbose event description.
no code implementations • 26 Nov 2021 • Thorsten Wittkopp, Philipp Wiesner, Dominik Scheinert, Odej Kao
In this paper, we present a taxonomy for different kinds of log data anomalies and introduce a method for analyzing such anomalies in labeled datasets.
no code implementations • 20 Sep 2021 • Thorsten Wittkopp, Alexander Acker, Sasho Nedelkoski, Jasmin Bogatinovski, Dominik Scheinert, Wu Fan, Odej Kao
Furthermore, we utilize available anomaly examples to set optimal decision boundaries to acquire strong baselines.
1 code implementation • 27 Aug 2021 • Dominik Scheinert, Houkun Zhu, Lauritz Thamsen, Morgan K. Geldenhuys, Jonathan Will, Alexander Acker, Odej Kao
Distributed dataflow systems like Spark and Flink enable the use of clusters for scalable data analytics.
1 code implementation • 29 Jul 2021 • Dominik Scheinert, Lauritz Thamsen, Houkun Zhu, Jonathan Will, Alexander Acker, Thorsten Wittkopp, Odej Kao
First, a general model is trained on all the available data for a specific scalable analytics algorithm, thereby incorporating data from different contexts.
1 code implementation • 9 Mar 2021 • Dominik Scheinert, Alexander Acker, Lauritz Thamsen, Morgan K. Geldenhuys, Odej Kao
Operation and maintenance of large distributed cloud applications can quickly become unmanageably complex, putting human operators under immense stress when problems occur.
no code implementations • 23 Feb 2021 • Harold Ott, Jasmin Bogatinovski, Alexander Acker, Sasho Nedelkoski, Odej Kao
To that end, we utilize pre-trained general-purpose language models to preserve the semantics of log messages and map them into log vector embeddings.
no code implementations • 12 Feb 2021 • Soeren Becker, Florian Schmidt, Anton Gulenko, Alexander Acker, Odej Kao
Edge computing was introduced as a technical enabler for the demanding requirements of new network technologies like 5G.
no code implementations • 11 Feb 2021 • Morgan Geldenhuys, Lauritz Thamsen, Odej Kao
However, this is an expensive operation that negatively impacts the overall performance of the system, and manually optimizing fault tolerance for specific jobs is a difficult and time-consuming task.
Distributed, Parallel, and Cluster Computing
no code implementations • 11 Feb 2021 • Morgan Geldenhuys, Lauritz Thamsen, Kain Kordian Gontarska, Felix Lorenz, Odej Kao
Distributed stream processing has become key to analyzing data generated by these connected devices and improving our ability to make decisions.
Distributed, Parallel, and Cluster Computing
no code implementations • 27 Jan 2021 • Sabtain Ahmad, Kevin Styp-Rekowski, Sasho Nedelkoski, Odej Kao
We demonstrate the effectiveness of the proposed method on two rotating machine datasets, and compare the quality of the automatically learned features with a set of handcrafted features by training an Isolation Forest model on each of the two feature sets.
no code implementations • 25 Jan 2021 • Kevin Styp-Rekowski, Florian Schmidt, Odej Kao
Forecasting of time series in continuous systems is becoming an increasingly relevant task due to recent developments in IoT and 5G.
no code implementations • 15 Jan 2021 • Jasmin Bogatinovski, Sasho Nedelkoski, Alexander Acker, Florian Schmidt, Thorsten Wittkopp, Soeren Becker, Jorge Cardoso, Odej Kao
Finally, all this will result in faster adoption of AIOps, further increase the interest in this research field, and contribute to bridging the gap towards fully autonomous operating IT systems.
1 code implementation • 8 Sep 2020 • Sasho Nedelkoski, Mihail Bogojeski, Odej Kao
Through several experiments, we demonstrate that the model improves on state-of-the-art multimodal methods based on variational inference on various computer vision tasks such as colorization, edge and mask detection, and weakly supervised learning.
no code implementations • 21 Aug 2020 • Sasho Nedelkoski, Jasmin Bogatinovski, Alexander Acker, Jorge Cardoso, Odej Kao
We propose Logsy, a classification-based method that learns log representations in a way that distinguishes between normal data from the system of interest and anomaly samples from auxiliary log datasets, easily accessible via the internet.
1 code implementation • 7 Jul 2020 • Alexander Acker, Thorsten Wittkopp, Sasho Nedelkoski, Jasmin Bogatinovski, Odej Kao
First, KPI types like CPU utilization or allocated memory are very different and hard to express with the same model.
2 code implementations • 17 Mar 2020 • Sasho Nedelkoski, Jasmin Bogatinovski, Alexander Acker, Jorge Cardoso, Odej Kao
This allows the coupling of the MLM as pre-training with a downstream anomaly detection task.