Positive Thinking Company — https://www.positivethinking.tech/

Hello Camunda 8 — https://positivethinking.tech/insights/hello-camunda-8/ — Mon, 15 Jan 2024

Camunda Platform 8 is a scalable, resilient, messaging-based process automation system. The main difference between Camunda 7 and 8 is that Camunda Platform 8 is designed as a Software as a Service (SaaS) solution and works as a remote workflow engine, based on the open-source project Zeebe.

The use cases for Camunda Platform 8, as described in the official documentation, include the following:

  • Orchestrate Human Tasks
  • Orchestrate, Observe and Analyze Microservices
  • Take Control of Your RPA Bots
  • Build a Centralized Process Automation Platform
  • Modernize Legacy IT Systems
  • Replace Homegrown Workflow Solutions

While this article contains code snippets in Kotlin, you don’t need to be a Kotlin developer to follow along. Camunda Platform 8 supports other programming languages as well; see the list of supported languages and frameworks in the Zeebe clients section below.

Camunda Components & Architecture

Here is an overview of the components that make up Camunda Platform 8:

  • Modeler – The Modeler is used to design and deploy BPMN processes. There are two versions: the Web Modeler, which is part of the SaaS solution, and the Desktop Modeler, which, as the name suggests, is a desktop application that runs on Windows, macOS, and Linux.
  • Workflow Engine (Zeebe Engine) – Zeebe is the process automation engine powering Camunda Platform 8.
  • Console – Create, configure, manage, and monitor clusters for your environments.
  • Tasklist – A ready-to-use web application that allows users to work on assigned tasks.
  • Operate – This tool is built for monitoring and troubleshooting process instances. It provides transparency and real-time visibility to monitor, analyze, and resolve problems.
  • Optimize – Offers business intelligence tooling for Camunda customers. By leveraging data collected during process execution, you can access reports, share process intelligence, analyze bottlenecks, and examine areas in business processes for improvement.

Zeebe clusters are too complex to explain in this article. For now, it is sufficient to know that a cluster handles the scalability of your workflow engine. If you are interested in more details, I recommend the Camunda documentation on Zeebe.

Zeebe clients

The clients give you the ability to control your process instances and run your tasks. They connect to Camunda Platform 8 via gRPC, which allows different languages and frameworks to be used. It’s also possible to create a polyglot architecture, so the choice is yours. Camunda currently maintains three officially supported clients.

In addition to the official clients, there are community clients for C#, JavaScript/Node.js, Micronaut, Python, Ruby, Rust, Spring, and Quarkus. Some of them are wrappers around the official clients (e.g. Spring or Micronaut).

You now have a rough introduction to Camunda and some key technical facts about it. With that, we are ready to start our first lab to get a feeling for the interaction between your own code and the workflow engine.

Take advantage of Camunda Platform 8

Prerequisites

  • Java/Kotlin
  • Gradle
  • IDE (IntelliJ, Eclipse, VSCode, or similar)

Registration

To skip the technical setup and take advantage of the SaaS solution, we start by registering for Camunda Platform 8. With this account you can create clusters, deploy processes, and create new instances.

1. Visit camunda.io/signup to open the Sign Up screen.

2. Fill out the form and submit. After that you’ll receive a confirmation email. Click on the link to verify your email address and set your password.

3. After the login, you’ll see the console overview page. This is the central place to manage your clusters, and the diagrams and forms you want to deploy to Camunda Platform 8.

Orchestrate your first BPMN process with Camunda Platform 8

Design and deploy a process

Let’s design and deploy your first BPMN process including a service task. This example will help you to understand how you can start your microservice orchestration.

1. Open the Web Modeler in a new tab by clicking on Modeler in the navigation bar.

2. Create a New project and select New > BPMN Diagram. You can rename your project and diagram by clicking on the navigation item and selecting Edit name.

3. Give your model a descriptive name and id within the General tab inside the detail panel on the right side of the screen. We’ll use Service-Task-Example for the name and service-task-example for the id.

4. Use the Web Modeler to design a BPMN process with a service task. You can select the Start Event and click on the task icon on the context palette to append a task. Click the wrench icon and select service task to change the task type.

5. Add a descriptive name using the details panel. For this example, we’ll use Microservice Example. After that expand the Task definition section and use orchestrate-something as Type. This value is necessary to connect the service task to the corresponding microservice code.

6. Finish your first model by appending an EndEvent.

7. Deploy your diagram by clicking on the Deploy diagram button. You may need to create a cluster first; in that case, follow the instructions in the section Create a cluster and credentials and come back after completion.

8. Start a new process instance by clicking on the Start Instance button.

9. Navigate to Operate by clicking on the honeycomb icon > View process instances.

10. You’ll see your process instance with a token waiting at the service task.

Create a cluster and credentials

To deploy and run a process, you need to create a cluster. To connect a worker to a service task you need to create client credentials as well:

1. Create a cluster by clicking on Deploy diagram > create a new cluster. Name your cluster My first Cluster and click Create cluster.

2. The creation will take a few moments. Once the cluster is healthy, you’re able to deploy the diagram.

3. Switch back to your other Camunda tab. Navigate to clusters > My first cluster > API and click Create your first Client. Provide a descriptive name for your client, like microservice-worker. For this how-to you need to select Zeebe as the scope. Copy or download the credentials after client creation; once you close the window, you will not be able to access the generated client secret again.

Create a service task

Next, you’ll create a worker for the service task and connect it with the BPMN process you created in the previous section.

1. Create a new Spring Boot project with Gradle and add implementation("io.camunda:spring-zeebe-starter:8.0.9") as a dependency to your build.gradle.kts.

2. Add your copied credentials to application.properties.
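A minimal sketch of what that configuration might look like, assuming the zeebe.client.cloud.* property names used by the spring-zeebe starter for SaaS connections; the values are placeholders for the credentials you copied in the previous section:

```properties
# Connection settings for your Camunda SaaS cluster (placeholder values)
zeebe.client.cloud.clusterId=your-cluster-id
zeebe.client.cloud.clientId=your-client-id
zeebe.client.cloud.clientSecret=your-client-secret
zeebe.client.cloud.region=bru-2
```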

3. Copy the following code snippet to your project:

import io.camunda.zeebe.client.api.response.ActivatedJob
import io.camunda.zeebe.spring.client.EnableZeebeClient
import io.camunda.zeebe.spring.client.annotation.ZeebeWorker
import org.springframework.boot.autoconfigure.SpringBootApplication
import org.springframework.boot.runApplication

@SpringBootApplication
@EnableZeebeClient
class Application {

   @ZeebeWorker(type = "orchestrate-something", autoComplete = true)
   fun orchestrateSomething(job: ActivatedJob) {
      println("Congratulations, you created a worker!")
   }
}

fun main(args: Array<String>) {
   runApplication<Application>(*args)
}

  • The class annotation @EnableZeebeClient loads the necessary Zeebe client configuration for Spring.
  • The method annotation @ZeebeWorker(type = "orchestrate-something", autoComplete = true) defines a worker, which requests jobs at a regular interval for the task type orchestrate-something.

4. Now you can run the application. You should see the message Congratulations, you created a worker! in your output stream.

5. Navigate to Operate, and you’ll see your token has moved to the end event, completing this process instance.

Congratulations! You successfully designed, deployed, and orchestrated your first BPMN process with Camunda Platform 8.


Cloud & Sustainability: How to make your cloud usage more sustainable? — https://positivethinking.tech/resources/white-paper/cloud-sustainability-how-to-make-your-cloud-usage-more-sustainable/ — Mon, 18 Dec 2023

Unlock the potential of sustainable and cost-efficient cloud computing with our white paper on the convergence of FinOps and GreenOps. This guide is a must-read for forward-thinking IT and business leaders seeking to innovate responsibly in the cloud era.

Creating Native Images in Spring Boot — https://positivethinking.tech/insights/creating-native-images-in-spring-boot/ — Thu, 07 Dec 2023

In the ever-evolving landscape of Java development, one of the exciting advancements is the ability to create native images of Spring Boot applications. Native images offer a range of benefits, including lightning-fast startup times and reduced memory consumption, making them ideal for serverless functions, microservices, and containerized applications. However, achieving native image compatibility requires overcoming Java’s dynamic nature, a task expertly tackled by GraalVM and Spring Native.

In this article, we will delve into the intricacies of creating native images in Spring Boot. We’ll walk you through the essential steps and considerations that developers need to keep in mind. By the end of this journey, you’ll have a solid understanding of how to optimize your Spring Boot applications for native image compilation.

Metadata for Native Image Compilation

Java’s dynamic nature, characterized by features like reflection and dynamic proxies, presents a unique challenge when aiming to build native images. Native image compilation essentially freezes the application’s state at build time, and any dynamic behaviors must be explicitly configured. This is where metadata files come into play.

To bypass the dynamic aspects of Java, you’ll need to provide metadata to GraalVM about your application’s behavior. This metadata informs GraalVM about what classes, methods, and resources should be included in the resulting native image. This is achieved through the use of configuration files, typically written in JSON format, which are placed in the META-INF/native-image/<group.id>/<artifact.id> folder of your project.

For instance, let’s say your Spring Boot application uses reflection to access classes at runtime. Without proper configuration, GraalVM would have no knowledge of which classes are being accessed dynamically, making them unavailable in the native image. By crafting reflection configuration files, you explicitly declare which classes should be considered during native image compilation, effectively guiding GraalVM in capturing the necessary metadata.

Configuration Files for Native Image Compilation

In the native image compilation process, a key challenge is taming Java’s dynamic nature. GraalVM requires explicit instructions to include classes and resources in the resulting native image. This is accomplished through various configuration files placed in the META-INF/native-image folder of your project. Let’s explore the essential types of configuration files.

Reflections Configuration

Reflection is a dynamic Java feature commonly used in frameworks like Spring. It allows you to inspect and interact with classes and methods at runtime. To ensure GraalVM includes the necessary elements in the native image, create a Reflections Configuration file (e.g., reflect-config.json). This file specifies which classes, methods, and fields should be accessible through reflection.

Example reflect-config.json:

[
  {
    "condition": {
      "typeReachable": "<condition-class>"
    },
    "name": "<class>",
    "methods": [
      { "name": "<methodName>", "parameterTypes": ["<param-one-type>"] }
    ],
    "queriedMethods": [
      { "name": "<methodName>", "parameterTypes": ["<param-one-type>"] }
    ],
    "fields": [{ "name": "<fieldName>" }],
    "allDeclaredClasses": true,
    "allDeclaredMethods": true,
    "allDeclaredFields": true,
    "allDeclaredConstructors": true
  }
]

Dynamic Proxies Configuration

Dynamic proxies enable advanced Java features like AOP. To handle dynamic proxies correctly in the native image, create a Dynamic Proxies Configuration file (e.g., proxy-config.json). Specify interfaces and classes that should be proxied dynamically to retain dynamic proxy-related functionality.

Example proxy-config.json:

[
  {
    "condition": {
      "typeReachable": "<condition-class>"
    },
    "interfaces": ["IA", "IB"]
  }
]

JNI, Resources, and Serialization Configuration

Beyond reflections and dynamic proxies, there are configuration files for other dynamic aspects:

  • JNI Configuration: Handles native method calls.
  • Resources Configuration: Ensures non-class resources are embedded.
  • Serialization Configuration: Manages custom serialization.

Each of these configuration files contributes to a comprehensive strategy for native image compilation.
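For illustration, a minimal resource-config.json that embeds all .properties files might look like the following; the pattern is a regular expression, and the bundle name is a made-up placeholder:

```json
{
  "resources": {
    "includes": [
      { "pattern": ".*\\.properties$" }
    ]
  },
  "bundles": [
    { "name": "messages" }
  ]
}
```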

Third-Party Library Configuration Files

As you embark on the journey of creating native images in Spring Boot, it’s important to note that you’re not alone in this endeavor. Many third-party libraries and frameworks have recognized the importance of native image compatibility and have taken steps to provide their own configuration files.

  1. Out-of-the-Box Compatibility: Some popular third-party libraries and frameworks, especially those commonly used in the Java ecosystem, have already configured their libraries for native image compilation. This means that when you include these libraries in your project, they come equipped with the necessary metadata for GraalVM. Examples include Spring Framework, Hibernate, and Apache Tomcat.
  2. Documentation: When integrating third-party libraries, consult their documentation to understand if they provide specific configuration files or guidelines for native image compatibility. Following their recommendations can save you time and effort in configuring these libraries yourself.

GraalVM’s Tracing Agent

To further streamline the process of preparing your Spring Boot application for native image compilation, GraalVM offers a powerful tool known as the Tracing Agent. This agent assists in gathering metadata throughout your project, aiding in the creation of the required configuration files.

Here’s how the Tracing Agent works:

  1. Meta Information Gathering: The Tracing Agent collects information about classes, methods, and resources used during your application’s runtime.
  2. Configuration File Generation: Based on the gathered data, the Tracing Agent helps generate configuration files for reflections, dynamic proxies, JNI, resources, and serialization.
  3. Iterative Process: To improve coverage and accuracy, it may be necessary to run your application multiple times with the Tracing Agent enabled. Each run captures additional metadata, refining the configuration files.

While the Tracing Agent is a valuable tool, it’s essential to understand that it may not cover all aspects of your application’s dynamic behavior. Some manual intervention in configuring these files may still be required to ensure optimal native image generation.
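In practice, the agent is attached to an ordinary JVM run via the -agentlib flag; the jar path below is a placeholder for your own build artifact:

```shell
# First run: write fresh configuration files into the project
java -agentlib:native-image-agent=config-output-dir=src/main/resources/META-INF/native-image -jar build/libs/app.jar

# Subsequent runs: merge newly observed metadata into the same directory
java -agentlib:native-image-agent=config-merge-dir=src/main/resources/META-INF/native-image -jar build/libs/app.jar
```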

Addressing Limitations of the Tracing Agent

While the Tracing Agent is a powerful tool for gathering metadata and automating the creation of configuration files, it’s important to understand that it may not cover all aspects of your application’s dynamic behavior. There are certain limitations and considerations to keep in mind:

  1. Incomplete Coverage: The Tracing Agent’s coverage depends on the execution paths exercised during its runtime analysis. It may not capture all possible scenarios, especially those involving conditional logic or rarely executed code paths. Therefore, manual intervention may still be necessary to configure specific cases.
  2. Custom Reflections: If your application employs custom reflection mechanisms or uses reflection in complex ways, the Tracing Agent might not fully comprehend these custom behaviors. You’ll need to provide additional configuration for such cases.
  3. Dynamic Class Loading: Classes loaded dynamically at runtime, a common practice in certain frameworks, may require manual configuration. The Tracing Agent may not always detect these dynamic loading patterns.
  4. Advanced Use Cases: In some advanced use cases, such as bytecode manipulation or unconventional reflection patterns, the Tracing Agent’s automated approach may fall short. In such scenarios, meticulous manual configuration is indispensable.

The GraalVM Reachability Metadata Repository

The GraalVM Reachability Metadata Repository is a valuable resource in the quest for native image compatibility. It serves as a community-driven hub where developers share their configurations for libraries commonly used in the Java ecosystem. Spring Boot, by default, takes advantage of this repository.

Here’s how it works:

  1. Dependency Analysis: When your Spring Boot project includes dependencies, Spring will automatically query the GraalVM Reachability Metadata Repository for the required configurations.
  2. Community Wisdom: By utilizing this repository, you benefit from the collective knowledge of the Java community. It simplifies the native image preparation process, as you can rely on configurations shared by others.
  3. Regular Updates: The repository is frequently updated with new configurations and improvements, ensuring ongoing support for a wide range of libraries.

Conclusion

In conclusion, creating native images in Spring Boot offers substantial performance benefits, but it comes with unique challenges due to Java’s dynamic nature. To navigate these challenges successfully:

  • Use a combination of automated tools like the Tracing Agent and manual configuration to cover all aspects of your application’s behavior.
  • Leverage native image compatibility features provided by third-party libraries whenever possible.
  • Take advantage of the GraalVM Reachability Metadata Repository to tap into the collective wisdom of the Java community.

By following these strategies, you’ll be well-equipped to optimize your Spring Boot project for native image compilation, delivering faster startup times and reduced memory consumption for your applications. Happy native image building!


Generative AI Strategy: From Understanding to Actionable Roadmap — https://positivethinking.tech/resources/webinar/generative-ai-strategy-from-understanding-to-actionable-roadmap/ — Fri, 01 Dec 2023

Access this on-demand webinar to explore a structured approach to experimenting with Generative AI and building an effective strategy. Together with some of our top experts in the field, you will learn the key practical and strategic layers of Generative AI and discover a robust framework to support your initiatives.

Achieving ESG Reporting: How SMEs Can Benefit from NLP — https://positivethinking.tech/insights/nlp-esg-reporting-smes/ — Fri, 01 Dec 2023

In corporate responsibility, the integration of NLP in CSRD/ESG reporting is emerging as a transformative approach. This article explores the multifaceted applications of NLP, shedding light on its potential to navigate the complexities of ESG data, enhance reporting efficiency, and contribute to more comprehensive and accurate sustainability reports.

From data identification and collection to processing, enrichment, and report creation, we dive into the ways NLP can be harnessed to address the challenges and opportunities inherent in CSRD/ESG reporting. And as we navigate through the diverse applications of NLP, we will also address the ethical considerations, limitations, and future directions of this technology in the sustainable reporting domain. The journey ahead illuminates the possibilities and challenges, offering insights and recommendations for companies seeking to leverage AI to transform the future of business.

First Thing First: Definitions and Context

What is ESG/CSRD Reporting?

Environmental, Social, and Governance (ESG) reporting, recently enhanced by the Corporate Sustainability Reporting Directive (CSRD) in the European Union, is a framework for companies to disclose their impact and practices related to environmental conservation, social responsibility, and governance structures. The importance of complying with this directive stems from the increasing emphasis on corporate responsibility and sustainability, with stakeholders, investors, and consumers demanding greater transparency and accountability from companies.

The journey towards effective CSRD/ESG reporting presents both challenges and opportunities for companies. Challenges include ensuring data availability, maintaining data quality, achieving standardization, and enhancing comparability and analysis across different companies and sectors. However, overcoming these challenges opens up opportunities for companies to improve their sustainability practices, enhance their reputation, attract investment, and drive long-term value creation, within our planet’s boundaries.

What is NLP?

Natural Language Processing (NLP) is a field of artificial intelligence (AI) that focuses on the interaction between computers and humans through natural language. It enables machines to understand, interpret, and generate human language, thereby facilitating more intuitive and meaningful interactions. The evolution of this Machine Learning (ML) technique has been marked by the introduction of new training methods, the Transformer architecture, the scalability of models through improvements in processing hardware (e.g. GPUs), and, more recently, the advent of Generative AI applications.

NLP, particularly with advancements like Large Language Models (LLMs) and Data-centric AI, faces its own set of challenges and opportunities. The challenges revolve around ensuring the ethical use of technology, managing data privacy, mitigating biases, and preventing potential mistakes and hallucinations in language models. However, the opportunities are vast, including automating content creation, enhancing customer interactions, extracting insights from unstructured data, and more.

Bridging ESG Reporting and NLP

In conclusion, the convergence of ESG reporting and NLP can be a game-changer for small and medium-sized companies. NLP, with its ability to analyze, interpret, and generate human language, can help companies navigate the complexities of CSRD/ESG reporting by automating data extraction, enhancing report quality, and providing insights for better sustainability practices.

Let’s explore each of these NLP techniques in detail.

NLP for ESG Data Identification

The Challenge: Diverse and Scattered ESG Data

In the realm of ESG reporting, companies grapple with the challenge of diverse and scattered data sources. ESG data can be hidden in a myriad of formats, scattered across various assets, both digital and physical, necessitating a sophisticated approach for accurate identification and extraction.

The Solution: NLP’s Advanced Capabilities

NLP emerges as a transformative solution, capable of scanning a multitude of information sources such as company documents, websites, news articles, and social media posts. It delves into the vast sea of unstructured text, pinpointing key themes, topics, metrics, and indicators essential for comprehensive ESG reporting.

Innovative Models and Techniques

  • Semantic Search Model: Our team has developed a semantic search model that can sift through over 12,000 existing ESG metrics, aiding companies in finding the most relevant metrics for specific search topics.
  • ESGBert: Another innovation is ESGBert, a model adept at classifying text segments based on their relevance to distinct ESG subtopics, providing a granular and nuanced view of ESG data.
  • Named Entity Recognition (NER): NLP utilizes NER to identify entities like companies, numbers, dates, and measurements within the text, enriching the ESG data pool.
  • Topic Modeling and Sentiment Analysis: These techniques are employed to gauge the sentiment and thematic context of the content, offering deeper and more insightful perspectives into the ESG landscape.
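As a toy illustration of how entity recognition surfaces such values, the sketch below uses hand-written regular expressions in place of a trained NER model; the sample sentence and the patterns are invented for this example:

```python
import re

# Sample ESG-style sentence (invented); regexes stand in for a trained NER model.
text = ("In 2023, Acme Corp reduced Scope 1 emissions by 12.5 % "
        "to 48 000 tonnes CO2e, and board diversity reached 40 %.")

years = re.findall(r"\b(?:19|20)\d{2}\b", text)          # date-like entities
percentages = re.findall(r"\d+(?:\.\d+)?\s?%", text)     # percentage measurements
quantities = re.findall(r"\d[\d\s]*tonnes CO2e", text)   # emission quantities

print(years)        # ['2023']
print(percentages)  # ['12.5 %', '40 %']
print(quantities)   # ['48 000 tonnes CO2e']
```

A production pipeline would replace these patterns with a statistical NER model, but the output shape — typed entities extracted from free text — is the same.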

Aligning with Regulatory Standards

NLP plays a pivotal role in aligning companies’ ESG data with the European Sustainability Reporting Standards (ESRS). By mapping identified data to these standards and considering the double materiality perspective, companies ensure their reporting is both comprehensive and compliant.

In short: NLP – A Valuable Tool for ESG Data Identification

In addressing the challenges associated with diverse and scattered ESG data, NLP proves to be a valuable tool. It offers significant capabilities to aid companies in navigating the complexities of ESG data identification. By leveraging AI, companies can enhance their ability to align with regulatory standards and make strides toward fostering a more sustainable and responsible business ecosystem. The integration of NLP in ESG reporting is a step forward, but it is essential for companies to continue exploring and adopting a multifaceted approach – meeting the evolving demands of sustainability and compliance.

NLP for ESG Data Collection

The Challenge: Varied Sources and Formats

In the pursuit of comprehensive ESG reporting, companies are often confronted with the task of collecting and aggregating data from a wide array of sources and formats, such as PDFs, HTMLs, CSVs, etc. The diversity in data types and the sheer volume of information necessitate efficient and accurate techniques for data collection.

The Solution: NLP Diverse Techniques

NLP offers a suite of techniques, including web scraping, document parsing, and data extraction, to facilitate the collection and aggregation of ESG data from various sources and formats. AI not only aids in accessing information but also ensures that the data integrated into the reporting is relevant and accurate.

Practical Applications and Expertise

  • Semantic Search and Question Answering System: We have developed a system that employs semantic search and question-answering to aid companies in finding and extracting ESG data from extensive documents, PDFs, email communications, and other written records.
  • Standardized Data Aggregation: Our expertise extends to standardized data aggregation, enabling companies to amalgamate ESG data from diverse sources and formats into a unified and consistent representation, thereby enhancing the coherence and reliability of the reporting.

Ensuring Quality and Traceability

Maintaining the quality and traceability of ESG data is paramount in building credibility in ESG reporting and complying with regulatory standards. Embracing a hybrid approach that combines ML/AI with human expertise, particularly in scenarios demanding high data quality, can significantly enhance the validation, verification, and cleaning processes.

This human-in-the-loop system ensures that a human expert reviews and validates the results produced by the AI models, thereby fortifying the integrity of the data. Such an integration of technologies and human oversight fosters trust and reliability in the ESG reporting process, ensuring that the data not only meets regulatory standards but also upholds the highest quality benchmarks.


In short: NLP – Enhancing ESG Data Collection

While AI is not the sole solution, it serves as a valuable asset in the ESG data collection process. It offers companies the tools to efficiently access, integrate, and validate data from a multitude of sources and formats. By leveraging NLP, companies can enhance the quality and traceability of their ESG data, contributing to more accurate and credible reporting, and ultimately, advancing sustainability goals.

NLP for ESG Data Processing or Enrichment

The Challenge: Analyzing and Enhancing ESG Data

The task of analyzing and enhancing ESG data is a critical step in ESG reporting. It involves extracting meaningful insights from the collected data and presenting it in a manner that is both informative and accessible. The complexity and volume of ESG data require sophisticated techniques for effective processing and enrichment.

The Solution: NLP’s Analytical Techniques

NLP offers a range of analytical techniques such as text summarization, text classification, sentiment analysis, and text clustering to analyze and enhance ESG data. These techniques enable companies to distill the essence of the data, categorize it effectively, gauge sentiment, and group similar data points, thereby optimizing the data for further research and processing.

Practical Applications and Innovations

  • Text Summarization: We have leveraged NLP to summarize the main points and trends of ESG data, facilitating faster research and optimized processing.
  • Sentence Embeddings: By converting text to sentence embeddings and storing them in a vector database, we have enabled fast and easy retrieval of relevant sentences based on semantic similarity.
  • Sentiment Analysis: We have utilized NLP to observe the sentiment fluctuations regarding companies in news articles over time, providing an additional layer of insight into the public perception of ESG-related events and developments.
  • Text Classification: NLP enables the classification of text segments into different ESG topics, such as climate change, human rights, diversity and inclusion, etc., facilitating easier filtering and analysis of relevant information.
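
The text-classification step from the list above can be illustrated with a minimal sketch. A production system would use a trained NLP classifier or an LLM; the keyword lookup below is only a stand-in to show the filtering workflow, and the topics and keywords are invented for illustration:

```python
# Sketch: classify text segments into ESG topics so they can be filtered
# and analyzed. The keyword sets are illustrative placeholders for a real
# NLP classifier or LLM-based labeling step.

ESG_TOPICS = {
    "climate change": {"emissions", "carbon", "climate", "co2"},
    "human rights": {"labor", "rights", "supply", "working"},
    "diversity and inclusion": {"diversity", "inclusion", "gender", "equal"},
}

def classify_segment(text):
    """Assign the topic with the most keyword overlap, or 'other'."""
    words = set(text.lower().split())
    scores = {topic: len(words & kws) for topic, kws in ESG_TOPICS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"

print(classify_segment("We reduced carbon emissions by 12% this year."))
# → climate change
```

Segments labeled this way can then be filtered per topic before summarization or sentiment analysis.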

In short: NLP – Elevating ESG Data Processing and Enrichment

NLP stands as a powerful tool in the realm of ESG data processing and enrichment. By employing a variety of analytical techniques, it enables companies to extract meaningful insights, optimize data accessibility, and enhance the overall quality of ESG reporting. Its integration in this phase is instrumental in advancing the depth and breadth of insights derived from ESG data, contributing to more informed decision-making and strategic planning for sustainability.

NLP for CSRD Report Creation

The Challenge: Crafting Comprehensive CSRD Reports

Creating CSRD reports is a meticulous task that demands precision, accuracy, and a comprehensive representation of a company’s ESG initiatives. The challenge lies in compiling vast amounts of ESG data into coherent, reliable, and informative reports that comply with regulatory standards.

The Solution: NLP’s Generative Capabilities

AI offers innovative solutions for crafting CSRD reports. It enables companies to leverage pre-trained models and existing reports to generate text fragments and suggestions, utilizing techniques such as natural language pre-training, fine-tuning, and generation. Importantly, NLP can integrate a human-in-the-loop approach, ensuring that the generated content is supervised and refined by human expertise.

Practical Applications and Innovations

  • Pre-Training Data: We have scraped sustainability reports from over 2000 companies (2022), which serve as a rich source of pre-training data. This data is instrumental when creating a CSRD report, aiding in the report drafting process and improving the model's ability to perform all required tasks.
  • LLMs for Natural Language Generation: Leveraging LLMs, we helped users in drafting answers or responses based on underlying ESG data. This application of NLP enhances the efficiency and coherence of the report creation process.
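
The LLM-assisted drafting with human oversight described above can be sketched as follows. `call_llm` is a hypothetical stub standing in for a real LLM API; the human-in-the-loop step is modeled by marking every generated fragment for mandatory review:

```python
# Sketch: generate a CSRD report draft fragment grounded in ESG facts and
# mark it for mandatory human review. `call_llm` is a hypothetical stub
# standing in for a real LLM service call.

def call_llm(prompt):
    # Stub: a real system would call an LLM API here.
    return f"[DRAFT] Answer based on: {prompt}"

def draft_fragment(question, esg_facts):
    """Draft an answer from underlying ESG data; a human must approve it."""
    prompt = f"{question}\nFacts: {'; '.join(esg_facts)}"
    return {"text": call_llm(prompt), "status": "pending_human_review"}

fragment = draft_fragment(
    "Describe the company's Scope 1 emissions trend.",
    ["Scope 1 emissions fell from 1300 t to 1200 t CO2e"],
)
print(fragment["status"])  # → pending_human_review
```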

Ensuring Accuracy and Reliability

NLP also plays a vital role in ensuring the accuracy and reliability of CSRD reports. By analyzing and refining the generated text, and by maintaining a human-in-the-loop approach, it contributes to the creation of reports that are not only compliant with regulations but also reflective of the true ESG initiatives of the company.

In short: NLP – Aiding in the Creation of Reliable CSRD Reports

AI emerges as a valuable ally in the creation of CSRD reports. Its generative capabilities, coupled with human oversight, facilitate the crafting of comprehensive and accurate reports. By leveraging NLP, companies can streamline the report creation process, ensure compliance, and accurately represent their commitment to environmental, social, and governance principles.

Harnessing NLP for Enhanced CSRD/ESG Reporting

In this exploration of AI’s role in CSRD/ESG reporting, we’ve delved into its multifaceted applications, spanning from data identification to report creation. The potential of NLP is indisputable, especially when navigating the complexities and diversity of ESG data, offering a pathway to more coherent and comprehensive reporting.

  • Key Findings: NLP stands as a pivotal tool, enhancing efficiency in data collection, providing deeper insights through data processing, and aiding in the crafting of accurate reports. Its applications are diverse, including text summarization, sentiment analysis, data validation, and natural language generation.
  • Benefits and Advantages: The integration of NLP brings forth numerous benefits for companies, particularly in aligning with regulatory standards, ensuring data quality, and accurately representing sustainability initiatives.
  • Recommendations for SMEs: Small and medium-sized companies are advised to start with specific applications, maintain human oversight, leverage pre-trained models, and prioritize data quality.

However, the journey with NLP is not without its challenges. Ethical considerations, data privacy, hallucinations (especially when using LLMs), and potential biases are hurdles to be addressed, necessitating a cautious and ethical approach to AI integration.

Looking forward, the horizon is ripe with opportunities for further research and development. The exploration of ethical AI, refinement of language models, and advancements in data quality management are just a few avenues that hold promise for elevating AI’s role.

The intersection of NLP and CSRD/ESG reporting holds significant promise. By embracing best practices, acknowledging limitations, and pursuing continuous innovation, companies can harness the capabilities of AI to contribute to a more sustainable, responsible business future.

Read more on Sustainability:

Sustainability and Data Products – The perfect match?

How to Successfully Drive Your CSRD/ESG Initiative With a Data-driven Approach

Our Teams’ Double Impact at the SAS Hackathon 2023: Driving Sustainability with Data & Analytics

Read more on AI:

LLM Mini-Series – Parallel Multi-Document Question Answering With Llama Index and Retrieval Augmented Generation

Navigating the EU AI Act: How Explainable AI Simplifies Regulatory Compliance

From Theory to Practice: A Generative AI Workshop to Guide a Leading Bank

The post Achieving ESG Reporting: How SMEs Can Benefit from NLP appeared first on Positive Thinking Company.

Taking Information Access Experience to a New Level with a Generative AI Powered Chatbot https://positivethinking.tech/case-study/taking-information-access-experience-to-a-new-level-with-a-generative-ai-powered-chatbot/ Wed, 22 Nov 2023 18:34:00 +0000 https://positivethinking.tech/?p=63700 As technology continually transforms our methods of interaction and information […]

The post Taking Information Access Experience to a New Level with a Generative AI Powered Chatbot appeared first on Positive Thinking Company.

As technology continually transforms our methods of interaction and information access, a groundbreaking project has emerged as a beacon of innovation in Europe. This case study explores the development of an AI-driven interactive chat interface, aimed at enhancing the way individuals engage with a wide range of information and opportunities. Central to this transformation is a chatbot powered by advanced language models, offering a more intuitive, engaging, and efficient way of retrieving information, which is particularly appealing to the digitally adept younger audience.

Context & Challenges

In the rapidly evolving landscape of online information dissemination, our client has been instrumental in providing a key resource for individuals seeking comprehensive knowledge and guidance. Their platform, primarily a web-based knowledge hub, has been crucial in offering detailed insights across various domains. However, with the progression of the digital age, user needs and behaviors, particularly among the younger demographic, are swiftly evolving.

This evolution presented our client with distinct challenges and opportunities:

  • Intuitive Information Retrieval Beyond Basic Search: The primary challenge was the existing mode of information retrieval on their platform. While the website’s traditional search system based on keywords was functional, it lacked the intuitive and interactive elements modern users, especially the younger generation, expect. These users seek instant, accurate, and context-rich responses, which a basic keyword search system struggles to provide. Recognizing the need to innovate and adapt to these changing preferences was crucial for maintaining the platform’s relevance and effectiveness.
  • Balancing Technological Advancements with Core Values: Given the nature of our client’s operations, integrating technological innovations like a novel chat interface had to align with their fundamental values of inclusivity and public service. The solution needed to be robust, accessible, and capable of handling diverse queries with precision and empathy.
  • Simplifying Data Complexity for User-Friendly Access: Another significant challenge was managing the extensive, yet fragmented data hosted on the website. The goal was to transform this wealth of information into a more accessible and user-friendly format, enabling users to easily find what they need and uncover new opportunities.

At this critical juncture, our client recognized the need to enhance their knowledge portal, making it more engaging and effective for a younger audience. This meant transitioning from a basic keyword search to a more advanced, semantically driven search approach. The focus was not just on technological upgrades but also on improving user engagement and satisfaction, ensuring the platform continued to be a vital resource in its domain.

Our approach was to develop an innovative, user-centric solution that bridged the gap between advanced technology and user-friendly interfaces, tailoring it effectively to the platform’s diverse user base.


Our Approach

To modernize our client’s information dissemination platform, we focused on three core strategies: implementing an advanced chat interaction system, designing a scalable and flexible solution, and conducting a detailed analysis of various Large Language Model (LLM) APIs. This approach was tailored to create an intuitive and engaging platform, ensuring technological robustness and adaptability for future developments.

Implementing an Advanced Chat Interaction System

We recognized the need for a sophisticated, user-friendly interface and chose to implement an advanced chat interaction system. This system uses the latest in language model technology to provide efficient and accurate responses to user queries. It combines retrieving relevant information from a comprehensive knowledge base via Retrieval Augmented Generation (RAG) with the generative capabilities of language models. This ensures that the chat interface not only fetches pertinent data but also presents it in a conversational, user-friendly manner. By grounding responses in the existing knowledge base, we ensured relevance, precision, and depth in the chatbot’s answers.
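
The RAG pattern described here can be sketched in a few lines. The word-overlap retriever and the tiny knowledge base below are deliberate simplifications, assumed for illustration; the actual pipeline used text embeddings, a vector database, and an LLM for generation:

```python
# Sketch of the Retrieval Augmented Generation pattern: retrieve the most
# relevant knowledge-base entry, then ground the generated answer in it.
# The word-overlap retriever and the example documents are illustrative
# stand-ins for an embedding-based retriever over a vector database.

KNOWLEDGE_BASE = [
    "Applications for the exchange program open every March.",
    "Language courses are free for registered members.",
]

def retrieve(query, docs, k=1):
    """Return the top-k documents by (naive) word overlap with the query."""
    def overlap(doc):
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(docs, key=overlap, reverse=True)[:k]

def answer(query):
    context = retrieve(query, KNOWLEDGE_BASE)
    prompt = f"Context: {' '.join(context)}\nQuestion: {query}\nAnswer:"
    return prompt  # a real system would pass this prompt to an LLM

print("exchange program" in answer("When does the exchange program open?"))
# → True
```

Grounding the prompt in retrieved context is what keeps the chatbot's responses anchored to the existing knowledge base rather than to the model's parametric memory alone.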

Designing a Scalable and Flexible Solution

In an environment where future-proofing and adaptability are key, we focused on creating a scalable and flexible solution. This approach ensures that the chat system is not limited to any specific server architecture, offering greater flexibility and scalability. We carefully considered the hardware and security requirements, aligning the solution with stringent data protection and privacy standards. This design also facilitates easier integration with future software architectures, ensuring long-term relevance and adaptability.

Comprehensive Analysis of LLM APIs

A critical aspect of our strategy was the thorough analysis of various LLM APIs, both open-source and commercial. We evaluated factors like cost, response time, quality of responses, and ease of integration. Our goal was to provide the client with the necessary insights to choose the most suitable LLM API, balancing cost-effectiveness with performance. This analysis was crucial in selecting an LLM API that met the client’s specific needs in terms of accuracy, speed, and integration with their existing technology.
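
Such an API comparison can be organized with a small benchmark harness along these lines. The two clients are stubs and the per-1K-token prices are invented for illustration; a real evaluation would call the actual LLM APIs and also score response quality:

```python
import time

# Sketch: a tiny harness for comparing LLM APIs on latency and cost.
# Both "clients" are stubs and the prices are invented for illustration;
# a real comparison would call the actual APIs and also rate quality.

def stub_api_a(prompt):
    time.sleep(0.01)  # simulate network + inference latency
    return "answer from A"

def stub_api_b(prompt):
    time.sleep(0.02)
    return "answer from B"

APIS = {
    "api_a": (stub_api_a, 0.002),   # (client, illustrative $ per 1K tokens)
    "api_b": (stub_api_b, 0.0005),
}

def benchmark(prompt, n_tokens=500):
    """Measure wall-clock latency and estimate cost for each API."""
    results = {}
    for name, (client, price_per_1k) in APIS.items():
        start = time.perf_counter()
        client(prompt)
        results[name] = {
            "latency_s": time.perf_counter() - start,
            "cost_usd": price_per_1k * n_tokens / 1000,
        }
    return results

report = benchmark("Summarize our ESG policy.")
print(sorted(report))  # → ['api_a', 'api_b']
```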

End-to-End Evaluation of the LLM Application

To effectively optimize the interface, it is necessary to select meaningful metrics that capture multiple aspects of interface quality and correlate highly with user acceptance. We used a multi-step evaluation strategy with synthetic questions, as well as questions and answers provided by testers, as a reference set. Based on this reference set we computed metrics like BERTScore and quantified Likert scores for grammar, fluency, accuracy, conciseness, and robustness via GPTScore. Above all, different versions of the interface were tested by subject matter experts. Based on these evaluation metrics, we were able to optimize the Retrieval step (selecting the best-performing text embeddings model, Top-K filtering, and context re-ranking approach) and the Generation step (optimizing the prompt templates, LLM model selection, and LLM generation parameters).
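
As a rough illustration of reference-based scoring, the sketch below computes a token-overlap F1 (a much-simplified stand-in for BERTScore, which compares contextual embeddings rather than surface tokens) and averages Likert-style ratings such as those produced via GPTScore or by human raters. All numbers are illustrative:

```python
# Sketch: reference-based evaluation of chatbot answers. Token-overlap F1
# is a simplified stand-in for BERTScore; Likert-style scores (e.g. from
# GPTScore or human raters) are simply averaged. Inputs are illustrative.

def overlap_f1(candidate, reference):
    """F1 over unique lowercase tokens shared by candidate and reference."""
    cand, ref = set(candidate.lower().split()), set(reference.lower().split())
    common = len(cand & ref)
    if common == 0:
        return 0.0
    precision, recall = common / len(cand), common / len(ref)
    return 2 * precision * recall / (precision + recall)

def mean_likert(scores):
    """Average a list of 1-5 Likert ratings for one quality dimension."""
    return sum(scores) / len(scores)

f1 = overlap_f1("Applications open in March", "Applications open every March")
print(round(f1, 2))  # → 0.75
```

Metrics like these make it possible to compare interface versions automatically before handing the shortlist to subject matter experts.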

In other words, our approach was centered on creating a solution that was technologically advanced and aligned with the client’s mission and operational requirements. By implementing an advanced chat interaction system, designing a scalable and flexible solution, and conducting a comprehensive analysis of LLM APIs, we aimed to transform the client’s platform into a more dynamic, interactive, and user-friendly resource. This transformation was geared towards enhancing user experience and ensuring the platform’s long-term adaptability and effectiveness.

Key Benefits

Enhanced User Engagement

The introduction of our advanced chat interface significantly transformed user interaction on our client’s knowledge platform. This novel, interactive feature successfully captivated a younger audience, crucial for the platform’s growth. The chat interface’s ability to deliver quick, contextually relevant responses made the platform more intuitive and user-friendly, fostering longer and more meaningful engagements.

Empowerment with Advanced AI Knowledge

A key advantage for our client was gaining expertise in advanced AI technologies. Through our collaborative efforts, we not only implemented a solution but also equipped the client’s team with the knowledge to understand and utilize advanced language models and AI technologies. This empowerment is vital for their long-term technological strategy, paving the way for further innovations and applications in their service offerings.

Future-Proofing with Scalable Solutions

Our flexible and scalable solution design ensures effectiveness in the current tech environment and readiness for future advancements. This adaptability is especially beneficial for our client, enabling them to stay at the forefront of the digital evolution. The ability to integrate with upcoming software architectures and adapt to evolving hardware and security needs marks our client as a forward-thinking and technologically agile entity.

Informed Decision-Making Through Comprehensive Analysis

Our in-depth analysis of various AI technologies provided our client with essential insights for informed decision-making regarding their tech investments. Understanding the balance between cost, performance, and integration capabilities allowed the client to choose the most appropriate technology that aligned with their operational objectives and budget. This process of informed decision-making is a significant advantage, ensuring that the client’s investments are both strategic and cost-effective.

Team Involved

The project’s success is largely due to the dedicated efforts of our specialized team, particularly the significant contribution of an experienced AI and Language Processing specialist. This expert, with profound expertise in both the technical and practical aspects of advanced AI technologies and conversational interfaces, played a crucial role in guiding the project to fruition.

The NLP Expert

  • Role and Expertise: Our specialist brought extensive knowledge in advanced language processing technologies and their real-world applications. Their expertise in both theoretical and practical aspects of AI and language processing was key in designing and developing the advanced chat interface.
  • Responsibilities: The specialist’s responsibilities encompassed overseeing the development of the chat interface, integrating advanced language models with the existing knowledge base, and optimizing the system for peak performance. They were also instrumental in the comparative analysis of different AI technologies, providing insights that shaped our strategic decisions.
  • Collaboration and Knowledge Empowerment: Beyond development, the specialist worked closely with the client’s team, facilitating knowledge transfer, and enabling them with the skills to manage and further develop the chat interface. This collaboration ensured that the client received not just a cutting-edge solution but also the expertise to maintain and enhance it in the future.

Technologies Used

The development and successful implementation of the LLM-based chatbot for our client involved a carefully selected array of technologies. Each technology played a specific role in ensuring the chatbot was efficient, scalable, and capable of delivering a high-quality user experience. Here’s a detailed look at the key technologies used:

  • Open Source LLM Models from the Huggingface Hub were selected based on their license, prediction quality, and inference properties (number of model parameters, quantization ability, compatibility with inference libraries).
  • Aleph Alpha & OpenAI served as commercial LLM service APIs against which the Open Source LLMs were benchmarked for answer generation. Commercial LLMs were also used to generate questions for a test set and to compute GPTScore metrics for evaluation.
  • Langchain orchestrated the chatbot’s underlying NLP pipeline, combining advanced NLP components like vector-database-based document retrievers, cross-encoder-based re-rankers, context re-ordering, and LLM text generation.
  • S3 Bucket, Amazon’s cloud storage solution, provided the necessary robust and scalable data storage. It was crucial for securely handling the large index datasets, text embedding model weights, and Open Source LLM model weights involved in operating the chatbot.
  • vLLM was used as an LLM inference library to enable fast, batched Open Source LLM text generation on the client’s infrastructure. It achieves production-level inference speed by making use of flash attention and paged attention.
  • Deeplake was used as a vector database for retrieval augmented generation, since it offers an API to persist and load the index directly from AWS S3 buckets. The ability to store metadata alongside the text embeddings also made Deeplake attractive.
  • Lastly, Streamlit facilitated the quick setup of user-friendly interfaces for user testing.


From Theory to Practice: A Generative AI Workshop to Guide a Leading Bank https://positivethinking.tech/case-study/generative-ai-workshop-leading-banking/ Fri, 10 Nov 2023 10:02:29 +0000 https://positivethinking.tech/?p=63510 In an era of data-driven innovation, our client, a leading European bank, sought to delve into the transformative world of Generative AI. Entrusted with demystifying this hyped domain, we crafted a tailored workshop to make Generative AI's intricacies approachable for both technical and business audiences. This case study lifts the veil on the journey from theoretical exploration to tangible strategic outcomes, highlighting how we sparked a new wave of strategic innovation in the banking sector.

The post From Theory to Practice: A Generative AI Workshop to Guide a Leading Bank appeared first on Positive Thinking Company.

In an era of data-driven innovation, our client, a leading European bank, sought to embark on a journey into the transformative world of Generative AI. With a mission to demystify the hype, we designed a tailored workshop to tackle the complexities of Generative AI and Large Language Models (LLMs) for an audience that included both technical and business experts.

This case study explores that journey, from initial theoretical immersion to strategic achievements, highlighting our role in sparking a new era of strategic innovation in banking. At the center is the pilot project that emerged from the workshop – an initiative that will not only increase the use of LLMs within the bank, but also support broader adoption of Generative AI. This evolution – from workshop to pilot and beyond – sets a path for embedding Generative AI at the core of the bank’s innovation strategy, providing a blueprint for internal expansion and technology adoption.

Context & Challenges

In a rapidly evolving financial landscape, staying ahead of the curve with the latest technological advancements is imperative for maintaining a competitive advantage. Our client, a leading bank operating in Europe, has always been at the forefront of adopting innovative solutions to enhance its services and operational efficiency. With a strong foundation in AI fields such as Machine Learning (ML) and Natural Language Processing (NLP), the bank has collaborated with us on several successful projects in the past.

As in many industries, the concept of Generative AI and LLMs started gaining traction for its potential to revolutionize various facets of the banking and financial services industry. Given our longstanding partnership and shared vision for innovation, our client approached us to conduct a comprehensive workshop on Generative AI.

The primary objective of the workshop was to demystify the principles of Generative AI, both theoretically and practically, supplemented with code examples to provide a hands-on experience. The bank envisioned this workshop as an initial step to evaluate the potential implications and applications of this emerging technology in their operations. Several key challenges were outlined by our client as the core focus of the workshop:

  • Broader Impact Assessment: Understanding the overarching impact of Generative AI on the bank’s operations, the financial industry in general, and the broader economy was crucial. This involved evaluating whether it was the right time to embrace Generative AI or if waiting for the technology to mature further was a more prudent approach.
  • On-Premise Feasibility: A significant concern revolved around the feasibility of implementing Generative AI solutions within an on-premise infrastructure. The focus was obviously on ensuring robust data security and adherence to regulatory standards due to the sensitive nature of financial data.
  • Technical Intricacies: Delving into the technical intricacies of implementing Generative AI, including the selection of suitable tools and the necessary infrastructure, was a key area of exploration. The workshop needed to provide a clear roadmap for navigating through the implementation of Generative AI.

Our Approach: A Generative AI Workshop for Banking

In order to address the multifaceted challenges posed by our client, we meticulously designed a workshop for exploring Generative AI. Our approach was tailored to cater to both the technical and business-oriented audience within our client’s team, ensuring a holistic understanding of Generative AI’s potential. It was structured into several key segments, each aiming at covering different aspects of Generative AI and LLMs:

Diverse Audience Engagement

Acknowledging the varied expertise and interests of the attendees, the workshop was designed to cater to a broad audience. The segments were carefully crafted to ensure that both the business and technical teams found the discussions enriching and directly relevant to their respective domains and perspectives.

Theoretical Exposure

The theoretical segments of the workshop served as the basis for building a solid understanding of Generative AI. We delved into its impact on business operations, the financial industry, and the economy at large. This section also covered essential tooling, cost implications, and various conceptual frameworks that underpin Generative AI, laying a robust foundation for the practical sessions that followed.

Practical Engagement

Transitioning from theoretical discussions to practical engagements, we orchestrated sessions focusing on prompting and implementing NLP solutions with LLMs. Participants were given the opportunity to engage in hands-on coding exercises or review the code examples we had prepared, thus getting a firsthand experience of implementing such advanced solutions.

Industry-Specific Insights

Incorporating industry-specific insights was crucial to make the discussions highly relevant and engaging. Drawing parallels from the banking sector, we presented statistics and use cases that resonated with the client’s domain, thereby facilitating a deeper comprehension of Generative AI’s applicability in the banking industry.

Project-Centric Discussion

An integral part of the workshop was dedicated to exploring how Generative AI could dovetail into ongoing and prospective projects. Engaging in an in-depth discussion, we collaboratively identified, prioritized, and challenged potential new use cases. This segment aimed at gauging the practicality and the value addition that Generative AI could bring to the table, in real-world project scenarios.

Our approach was calibrated to not only impart knowledge but also to stimulate thought-provoking discussions, enabling the participants to envision the possibilities that Generative AI could unfold in the banking sector.


Key Benefits

The meticulously designed workshop yielded substantial benefits, helping our client to gain a clear perspective on the potential of Generative AI. Here are the key advantages we observed:

Informed Decision-Making:

The deep insights gained from the workshop empowered our client to make well-informed decisions regarding their infrastructure and project strategies. The comprehensive understanding of Generative AI’s implications provided a solid basis for evaluating its relevance and applicability in their operational landscape.

Methodology Reassessment:

One of the tangible outcomes of the workshop was the reassessment of methodologies employed in an ongoing KYC (Know Your Customer) check automation project. Together, we revisited the project strategies, and the newfound understanding of Generative AI triggered a shift in methods to enhance the project’s efficiency and effectiveness.

New Project Initiation:

The workshop also marked the inception of a new project focused on transforming unstructured data to structured data, a crucial undertaking in the banking sector. This brand-new project demonstrated the practical application of the knowledge acquired during the workshop, embodying the client’s confidence in the potential of Generative AI and LLMs.

Enhanced Technical Acumen:

The hands-on sessions and practical engagements during the workshop significantly reinforced the technical acumen of the participants. The opportunity to work on code and explore real-world examples enriched their understanding, preparing them for the technical challenges that lay ahead in implementing such advanced solutions.

Strategic Direction:

The workshop played a pivotal role in shaping the strategic direction of our client. By identifying and prioritizing new use cases, they were better positioned to align their strategies with the evolving technological landscape, thereby fostering a culture of continuous innovation and readiness to embrace emerging technologies.

The benefits derived from the workshop are a testament to the pragmatic and collaborative approach adopted, which not only facilitated a robust understanding of Generative AI but also catalyzed actionable steps towards its adoption.

Team Involved During this Generative AI Workshop in Banking

For this workshop to be effective and insightful, it was imperative to have a team with a profound understanding of both the theoretical and practical aspects of Generative AI. We therefore deployed a team of two NLP experts for a month to prepare and execute the workshop and provide the necessary support to our client.

The NLP experts brought a wealth of experience to the table. Their comprehensive understanding of Generative AI, coupled with their ability to elucidate complex concepts in a digestible manner, made them the ideal facilitators for this workshop. They were adept at tailoring the content to cater to both the business and technical audiences, ensuring a holistic understanding of the topics discussed.

In the follow-up project concerning the transformation of unstructured to structured data, one NLP expert was assigned to work alongside the client’s team for at least four months. This engagement facilitated a collaborative and effective environment, ensuring the successful implementation of the solution proposed during the workshop.

Technologies Used

The technical backbone of the workshop and the subsequent projects was supported by a suite of cutting-edge technologies:

  • Python: Being a versatile and widely-used programming language, Python was the primary language used for coding and implementing the solutions discussed during the workshop.
  • HuggingFace: This technology was leveraged for its robust NLP libraries, enabling the efficient implementation of Generative AI solutions.
  • Azure: Microsoft Azure provided the cloud infrastructure necessary for hosting and managing the AI models discussed during the workshop.
  • Databricks: Utilized for its big data analytics capabilities, Databricks facilitated the handling and analysis of large datasets, a common requirement in Generative AI projects.
  • Langchain: This technology was employed for its language processing capabilities, crucial for the NLP aspects of the workshop and projects.
  • Llama Index: Leveraged for indexing and searching capabilities, aiding in managing unstructured data effectively.
  • Haystack: This technology was employed for its search and indexing capabilities, streamlining the handling of vast amounts of data.


How to Successfully Drive Your CSRD/ESG Initiative With a Data-driven Approach https://positivethinking.tech/insights/how-to-successfully-drive-your-csrd-esg-initiative-with-a-data-driven-approach/ Wed, 08 Nov 2023 15:08:11 +0000 https://positivethinking.tech/?p=63424 From 2025 on, the Corporate Sustainability Reporting Directive (CSRD) will challenge European Union (EU) companies regarding their sustainability reporting. In its final stage, over 50K companies in the EU, including 15K companies from Germany, will have to report holistically on their sustainability performance, adhering to ESG reporting standards1. This imposes a massive effort for these companies concerning data and sustainability management. However, the underlying goal of the CSRD poses an even greater challenge: mastering the transformation towards a net-zero and circular economy driven by robust data analytics and insights.

The post How to Successfully Drive Your CSRD/ESG Initiative With a Data-driven Approach appeared first on Positive Thinking Company.

]]>
From 2025 on, the Corporate Sustainability Reporting Directive (CSRD) will challenge European Union (EU) companies regarding their sustainability reporting. In its final stage, over 50K companies in the EU, including 15K companies from Germany, will have to report holistically on their sustainability performance, adhering to ESG reporting standards1. This imposes a massive effort for these companies concerning data and sustainability management. However, the underlying goal of the CSRD poses an even greater challenge: mastering the transformation towards a net-zero and circular economy driven by robust data analytics and insights.

At Positive Thinking Company, we are convinced that treating sustainability and data products as one topic, taking a data-driven sustainability approach right from the beginning, is crucial to mastering this transformation. The upcoming transparency requirements and other ESG reporting challenges posed by the CSRD are clear indicators that we are on the right track.

But HOW to bring all this to life in an organization to drive meaningful change?
Well, the best way to approach the transformation into a sustainable, digital future is to incorporate some key principles along with an integral data-driven focus.

Focus on Actions That Have a True and Measurable Impact

Finding real, meaningful change can be daunting, especially when many solutions look like greenwashing. But it does not have to be like that. We don’t have to settle for small fixes that don’t make a difference.

To embark on a successful sustainability transformation powered by data, it is imperative for businesses to adopt an impact-driven mindset.

CSRD requires us to identify what information is relevant or “material” to our stakeholders. This can be achieved through a materiality assessment: bringing together both the impact and financial dimensions. A conscious and in-depth analysis of both the impacts on the environment and society as well as the risks and opportunities to our own business activities is a key first step.  But this is also a unique opportunity to go beyond compliance and create a roadmap that focuses on the pursuit of genuine, measurable impact.

But how to achieve impactful results?

As a matter of fact: ‘you cannot manage what you don’t measure’. This is obviously the first step, but we want to go further. Therefore, a data-driven approach throughout the entire process will be key to identifying those actions that have a real impact.  

Integrating sustainability metrics into traditional controlling processes will allow you to assess actions from cost, complexity, and impact perspectives all at once. To achieve this, strong capabilities in data collection from various sources, data analysis, and insight generation are needed. Additionally, reporting skills are essential to communicate these findings to both internal and external stakeholders.
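As a rough illustration of assessing actions from cost, complexity, and impact perspectives all at once, here is a minimal sketch; the action names, scores, and weighting below are entirely hypothetical:

```python
# Minimal sketch: scoring sustainability actions on cost, complexity,
# and impact at once. All actions, scores, and weights are hypothetical.
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    cost: int        # 1 (low) .. 5 (high)
    complexity: int  # 1 (low) .. 5 (high)
    impact: int      # 1 (low) .. 5 (high)

def priority(a: Action, w_impact: float = 2.0) -> float:
    # Higher impact raises priority; cost and complexity lower it.
    return w_impact * a.impact - a.cost - a.complexity

actions = [
    Action("Switch to green electricity", cost=2, complexity=1, impact=4),
    Action("Redesign packaging", cost=4, complexity=4, impact=3),
    Action("Optimize logistics routes", cost=2, complexity=3, impact=4),
]

for a in sorted(actions, key=priority, reverse=True):
    print(f"{a.name}: priority {priority(a):.1f}")
```

In practice the scores would come from the data collection and analysis capabilities mentioned above, not from hand-picked numbers.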

Place Humans at the Center of All Your Activities to Drive Real Change

A human-centered approach is a goal in itself for more and more companies that understood that their endeavors cannot disregard the well-being of all people involved. Furthermore, by taking this path, cost optimization and profit will follow by:

  • attracting and retaining talent;
  • resonating with a wider audience that reclaims more human action from the companies they buy from;
  • reducing costs by optimizing the way tedious tasks are executed;
  • and so on…

Moreover, all stakeholders and employees must be involved in the transformation process to jointly create sustainable solutions for the future and hopefully become proactive agents of change. Encouraging innovation, collaboration, and knowledge-sharing empowers individuals to contribute to sustainability goals and aligns their individual purpose with the organization’s vision. As a starting point, this means that not only the Sustainability Office members need to sit at the table to discuss an appropriate data strategy for sustainability. On the contrary, stakeholders of every business unit should be involved.

Later on, to effectively support employees with the implementation of the developed strategies, an active moderation of the ongoing change process is required. This calls for an agile and comprehensive approach with regular feedback loops to build the change together.

Setting up a clear communication strategy combined with precisely defined metrics to measure the change and adoption during the process is critical. This facilitates the inclusion of all relevant actors in the company, fostering a sense of ownership and responsibility.

Choose a Holistic Approach Recognizing That a System Is More Than the Sum of Its Parts

It is essential to understand that sustainability goes beyond ‘just’ environmental concerns. It involves a broader spectrum, including often overlooked social dimensions, such as:

  • the well-being of communities,
  • fair labor practices,
  • and broader societal contributions.

Besides, the complexities of cross-impacts within a company highlight the need for an all-encompassing strategy. Integrating sustainability practices into every layer of business operations, from supply chains to employee engagement initiatives, fosters a broad transformation that extends beyond mere data collection. Utilizing data to gain 360-degree perspectives drives comprehensive transformation rooted in ESG reporting standards.

Through holistic and participative change support, organizational development is driven forward based on the strategies and measures that emerge from CSRD reports.

Keep Implementation Processes in Mind While Setting Up Your Sustainability Strategy

We believe that sustainability and digitalization must be at the core of a company’s strategy. Introducing them can be a prolonged process; nevertheless, the work should not remain at the strategic level but move quickly to implementation, generating fast, tangible results that can be measured and improved in subsequent iterations.

A harmonious integration of strategy and implementation is key here. Ideally, companies develop an actionable strategy that lets them start with the first use cases while still finalizing the strategy itself. A solutions/use-cases-oriented approach helps you identify the low-hanging fruit: actions with high impact but lower complexity (and lower costs).
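The value-feasibility idea can be sketched as a simple quadrant classification; the use cases, scores, and threshold below are hypothetical:

```python
# Sketch of a value-feasibility matrix: bucket candidate use cases into
# quadrants so the "low-hanging fruit" (high value, high feasibility)
# surfaces first. All use cases and scores are hypothetical.

def quadrant(value: int, feasibility: int, threshold: int = 3) -> str:
    if value >= threshold and feasibility >= threshold:
        return "quick win"          # start here
    if value >= threshold:
        return "strategic project"  # high value, but harder to deliver
    if feasibility >= threshold:
        return "fill-in"            # easy, but limited impact
    return "reconsider"

use_cases = {
    "CO2 dashboard from existing ERP data": (4, 5),
    "Full supply-chain footprint model": (5, 2),
    "Automated CSRD report layout": (2, 4),
}

for name, (v, f) in use_cases.items():
    print(f"{name}: {quadrant(v, f)}")
```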

Value-Feasibility Matrix CSRD ESG Reporting Positive Thinking Company

Reduce Complexity to Kick Off Your Data-driven Sustainability Journey

Sustainability and data are both extraordinarily complex topics that can be addressed from multiple perspectives. Navigating the transformation process can be overwhelming, especially for small and medium-sized companies.

Here again, narrowing down to solution-oriented approaches for the biggest challenges that companies have is immensely beneficial and the only way to truly make it manageable.

Expertise from the sustainability field and from the data side should converge in a way that allows both worlds to be explained in everyday language. Know-how and ideas should be shared throughout the company at a level that everyone, from production workers and customer service agents up to top management, can understand.

Measuring impacts is clearly the first step towards developing more sophisticated ways of reducing them. It is easy to fall into the trap of trying to solve everything at the same time, but identifying the biggest impacts from the beginning and addressing those first ensures the right focus.

Tailored data solutions can be instrumental in this segmentation, offering bite-sized insights and strategies for companies to gradually embrace and integrate.

How to Start Putting All the Pieces Together?

Embracing sustainability is more than just paying lip service to a trend. It’s about:

  1. Taking calculated, data-driven decisions and actions that create genuine, measurable change.
  2. Centering efforts on actions with a genuine and quantifiable impact, instead of merely ticking boxes.
  3. Placing humans at the heart of every initiative to ensure that the transformation not only benefits the environment but also resonates deeply with the very people driving the change.
  4. Adopting a holistic approach, acknowledging the interdependent facets of business operations.
  5. Moving quickly, keeping in mind that devising a strategy is crucial, but tangible implementation is equally significant. This involves identifying high-impact, low-complexity tasks to address right off the bat.
  6. Reducing the complexity of these topics by breaking them down into manageable, actionable steps is the key.

As data and digitalization experts, both technically and strategically, we can support you on this journey. We have partnered up with sustainability specialists who focus on sustainable strategy, impact generation, and change management. Together, we co-create tailor-made plans rooted in data-driven insights.

Providing guidance on selecting the right digital tools, establishing robust monitoring systems, and ensuring continuous improvement through data-informed decision-making is our way. We’re here to guide, support, and empower companies in their journey towards a fairer, greener, data-driven future.

ESG Reporting solution preview Positive Thinking Company

1 Source: European Green Deal (europa.eu)

The post How to Successfully Drive Your CSRD/ESG Initiative With a Data-driven Approach appeared first on Positive Thinking Company.

]]>
Event Sourcing vs Conventional Data Management https://positivethinking.tech/insights/event-sourcing-vs-conventional-data-management/ Fri, 03 Nov 2023 15:25:40 +0000 https://positivethinking.tech/?p=62857 Event Sourcing is a concept that everyone has heard about, yet it often remains unexplored, leaving many without a clear understanding of its principles and benefits. It disrupts traditional methodologies and revolutionizes data management, opening up new design possibilities that were once out of reach.

The post Event Sourcing vs Conventional Data Management appeared first on Positive Thinking Company.

]]>
Building applications that are scalable and easy to maintain is the ultimate goal for us, Software Engineers. Event Sourcing, a paradigm that has been gaining a lot of hype in recent times, is one way to achieve this objective.

Event Sourcing is a concept that everyone has heard about, yet it often remains poorly understood, leaving many without a clear grasp of its principles and benefits. It disrupts traditional methodologies and revolutionizes application data management, opening up new design possibilities that were once out of reach.

Conventional Data Management

Relational Database Management System 

The conventional data management approach involves directly storing the current state of an application’s data at any given point in time. Storing these data points is the standard for most databases and systems; when the state changes, the new replaces the old and the old vanishes. Although this strategy is more straightforward and more practical to set up, it may fail to record the complete history of the application’s development up to that point.

Example of Conventional Data Management:

Conventional-Data-Management

An Administrator creates a new User (John Doe). This is done by inserting a new record in the Users Table.

  INSERT INTO Users (FirstName, LastName, BirthDate)
  VALUES ('John', 'Doe', '1993-02-19');

After inserting, the Users Table will appear as follows.

Conventional-Data-Management-2

Now, consider a scenario where there was an error in the First Name of the user with ID 2, and the administrator wishes to correct this by updating the user’s first name. In this situation, an UPDATE statement would be executed.

  UPDATE Users
  SET FirstName = 'Bob'
  WHERE ID = 2;

This statement modifies the record with ID 2, assigning the new value “Bob” to the FirstName column. As a result, whenever we query the table, we will observe “Bob” in the entry with ID 2.

Conventional-Data-Management-3

This database solution is suitable for most scenarios and delivers the expected functionality. However, difficulties emerge when we need to track changes in names, such as the transition from ‘John’ to ‘Bob’. The system lacks mechanisms for preserving historical data related to our dataset. There are a few patterns and solutions we could employ, like maintaining a history table or enhancing handlers with history-tracking behaviors.

However, these solutions aren’t inherent to the system, and incorporating these mechanisms could introduce errors or suboptimal implementations, bringing a risk of regression. Even if we did develop a robust system capable of capturing every change, another crucial question remains: how did we arrive at this point, and can we replay that sequence of events? Retrofitting these mechanisms into an existing codebase can be quite a challenge, and this is where Event Sourcing comes into play.
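To make the history-table workaround mentioned above concrete, here is a minimal sketch using SQLite; the schema and trigger names are illustrative assumptions, not the article's actual design:

```python
# Sketch of the history-table workaround, using SQLite. A trigger copies
# the old row into a history table before every update. Table, column,
# and trigger names are illustrative.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Users (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE UsersHistory (UserID INTEGER, FirstName TEXT, LastName TEXT,
                           ChangedAt TEXT DEFAULT CURRENT_TIMESTAMP);
-- Preserve the old values before each update.
CREATE TRIGGER users_audit BEFORE UPDATE ON Users
BEGIN
    INSERT INTO UsersHistory (UserID, FirstName, LastName)
    VALUES (OLD.ID, OLD.FirstName, OLD.LastName);
END;
""")

con.execute("INSERT INTO Users (FirstName, LastName) VALUES ('John', 'Doe')")
con.execute("UPDATE Users SET FirstName = 'Bob' WHERE ID = 1")

print(con.execute("SELECT FirstName FROM Users").fetchall())         # [('Bob',)]
print(con.execute("SELECT FirstName FROM UsersHistory").fetchall())  # [('John',)]
```

Note how the audit logic lives alongside, rather than inside, the system's normal data model, which is exactly the fragility the paragraph above describes.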

Event-driven Data Management

Event Sourcing

The Event Sourcing design stands in contrast to the conventional data management approach. Instead of only storing the latest state, Event Sourcing captures and retains every change, action, or decision as an immutable event in a chronological log. These recorded events create a comprehensive historical record, documenting the entire evolution of the application over time.

By preserving the full history of data changes, Event Sourcing provides a detailed trail of past actions, offering valuable insights into the system’s behavior and facilitating thorough debugging and auditing processes.

Example of an Event-driven Data management:

Event-Data-Management

Let’s consider the example from the conventional approach. An administrator intends to create a new user and subsequently needs to alter their name due to a miscommunication. Instead of recording the data in a row and later modifying the value within this row, an Event-Driven system will log two events: a UserCreatedEvent and an UpdateUserEvent. These events encompass all the necessary data for executing actions on the entity (fc1c0c5d-f62c-…).

Event-Data-Management-2

So, now that we have the data stored, how does one go about querying that data? Let’s break down the process of querying an entity within an Event-Driven system. Initially, the events associated with a particular entity are retrieved from the Event Store. Following this, these events are replayed onto a model. It’s important to note that the model is technology-agnostic. When we mention “replaying onto a model”, we’re referring to the process of iterating through each event and applying them to the model.

The Model defines and encompasses the rules for interpreting each Event and constructing a state. This ultimate state comprises all the information of the Queried Entity.

By maintaining Events at the core of the system, a history log is established, enabling the capability to replay events. This abstraction provides the ability to engage in time travel and manipulate the system state to your benefit.
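The replay process described above can be sketched in a few lines; the event names follow the article's UserCreatedEvent/UpdateUserEvent example, while the in-memory event log and model rules are simplified assumptions:

```python
# Sketch of replaying events onto a model. The in-memory "event store"
# and the model rules are simplified assumptions for illustration.

events = [  # chronological log for one entity
    {"type": "UserCreatedEvent", "firstName": "John", "lastName": "Doe"},
    {"type": "UpdateUserEvent", "firstName": "Bob"},
]

def apply(state: dict, event: dict) -> dict:
    # The model's rules for interpreting each event type.
    if event["type"] == "UserCreatedEvent":
        return {"firstName": event["firstName"], "lastName": event["lastName"]}
    if event["type"] == "UpdateUserEvent":
        return {**state, "firstName": event["firstName"]}
    return state

state = {}
for event in events:          # replaying: iterate and apply each event
    state = apply(state, event)

print(state)  # → {'firstName': 'Bob', 'lastName': 'Doe'}
```

Replaying only a prefix of the log (for example, just the first event) reconstructs the entity as it was at that point in time, which is the "time travel" capability mentioned above.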

Summary

Data Storage:

  • Event Sourcing: Events are stored in a sequential manner, building a historical record of all significant changes in the application’s state over time.
  • Conventional Approach: Only the current state is stored, and the history of changes leading to the current state is often not explicitly captured.

Reconstruction:

  • Event Sourcing: The application’s current state can be reconstructed by replaying the sequence of events from the beginning up to the present moment.
  • Conventional Approach: The current state is readily available, but there might not be an easy way to recreate past states without additional tools or backups.

Conclusion

Event Sourcing is a game-changer for application data management, and it’s clear that this paradigm shift is reshaping the industry. Instead of solely focusing on the present state of data, Event Sourcing places emphasis on capturing the history and context of data changes.

Keep in mind that Event Sourcing is not a one-size-fits-all solution. It should only be adopted when truly necessary, as it comes with a cost – the cost of added complexity.

When designing a system, we recommend adhering to the principle of prioritizing simplicity and consistency over unnecessary complexity. Introducing Event Sourcing does bring an extra layer of complexity to your system, so make sure you have a solid rationale for doing so.

If you’re interested in implementing Event Sourcing, you might find Greg Young’s repository helpful.

More Resources

To learn more about Generative AI in an Agile environment, you can take a look at our insight: Agile Architecture, a key practice for embracing changes.

Read more Insights on Data & Analytics.

The post Event Sourcing vs Conventional Data Management appeared first on Positive Thinking Company.

]]>
Unlocking the Potential of Data-centric AI in Generative AI and NLP https://positivethinking.tech/insights/unlocking-the-potential-of-data-centric-ai-in-generative-ai-and-nlp/ Fri, 27 Oct 2023 09:35:21 +0000 https://positivethinking.tech/?p=63227 This article aims to demystify the concept of Data-centric AI, elucidating why it's making a substantial impact in Generative AI and NLP, and how you can practically implement it in your projects. Through a clear and comprehensive exploration, we'll dive into the essence of Data-centric AI, providing valuable insights to both business and technical audiences.

The post Unlocking the Potential of Data-centric AI in Generative AI and NLP appeared first on Positive Thinking Company.

]]>
Let’s envision for a moment a future with robust and reliable Large Language Models (LLMs) that have fewer weights and fast inference times, are production-friendly, and are thoroughly evaluated.

The path towards achieving this goal is not to gather larger and larger datasets and train LLMs with even more model parameters. Interestingly, the opposite “Less is more” approach is much more promising. Data-centric NLP techniques generate high-quality, debiased, information-rich datasets for pre-training, finetuning, and alignment in a semi-automated manner. The challenging discipline of Data-Centric NLP focuses on developing datasets at scale and gives model optimization a secondary priority.

But what makes this approach pivotal? Why should Data Science and AI teams even consider it?

This article aims to demystify the concept of Data-centric AI, elucidating why it’s making a substantial impact in Generative AI and NLP, and how you can practically implement it in your projects. Through a clear and comprehensive exploration, we’ll dive into the essence of Data-centric AI, providing valuable insights to both business and technical audiences.

What is Data-centric AI?

Data-centric AI is a straightforward yet powerful concept: the quality and expressiveness of your data are central to the success of AI projects. Let’s break this down into simpler terms.

Data-centric AI is the discipline of systematically engineering the data used to build an AI system.

— Andrew Ng

For more: have a look at the video “A Chat with Andrew on MLOps: From Model-centric to Data-centric AI” on the DeepLearningAI YouTube channel.

In traditional AI projects, a lot of focus is put on the models and algorithms. Think of the model as the engine of a car. A lot of effort goes into making this engine as powerful and efficient as possible.

However, in a data-centric approach, the focus shifts towards the data, which you can think of as the fuel for the engine. Just as a car runs better with high-quality fuel, an AI model performs better with high-quality data.

This means that instead of spending most of our effort on tweaking the engine, we ensure that we’re using the best fuel available. In practical terms, this involves:

  • cleaning up the data,
  • making sure it’s relevant,
  • and that it accurately represents the problem we want the model to solve.

For both business and data experts, this approach emphasizes the importance of the data you feed into your AI models. Better data leads to better performance, making your AI projects and products more successful and reliable.

Why Data-centric AI is a Game-Changer in NLP?

Well, why is focusing on data such a big deal in NLP? Below we explore why this approach is making waves and changing how we handle Generative AI and NLP projects.

  1. Improving Model Performance: When the data is clean and representative, AI models can learn general patterns more effectively. In NLP, this means that models can understand and process language in a way that’s more accurate and useful.
  2. Saving Time and Resources: By putting an emphasis on quality data, we can save a lot of time that would otherwise be spent constantly tweaking and adjusting the AI models. This makes the development process far more efficient. Just consider how hard it is to perform regression tests for LLMs upon redeployment, to ensure a new model version is at least as good as the previous one.
  3. Enhancing Adaptability: With a strong foundation of quality data, NLP models become more adaptable. They can better handle new information and changes, making them more versatile and reliable in real-world applications. Consider the dataset as a kind of schoolbook. If the key concepts are clearly outlined, it is possible to “connect the dots” and come up with inspiring new ideas. If irrelevant and duplicated content is shown to a learner, it is hard to develop a useful mental model of knowledge to be recombined.
  4. Facilitating Better Decision-Making: In business, having an NLP model that provides accurate and reliable results means that decision-makers have better information at their fingertips, leading to more informed and effective decisions.

In simple terms, a Data-centric approach makes any NLP project more robust, efficient, and adaptable, which is essential for meeting the diverse and dynamic demands of the rising LLMs.

Banner White Paper Data-centric AI for Natural Language Processing NLP

How to Implement a Data-centric AI Approach?

Implementing a Data-centric approach in your NLP and Generative AI projects at scale (for terabytes of text data) might seem daunting, but it doesn’t have to be. Here are some practical steps and insights to guide you through the process.

1. Data Collection and Organization

  • Focus on quality: Try to collect high-quality, relevant data in the first place, and ensure that the data is representative of the real-world scenarios you want your model to handle. It will pay off in the long run.
  • Pay attention to organization: Organize the data in a structured manner, making it easier to access and use during the model training process.

2. Data Cleaning

  • Remove irrelevant data: Not all data collected will be useful. Identify and remove information that doesn’t contribute to the model’s learning process.
  • Handle missing data: Decide how to handle gaps in the data, whether by removing such instances or filling them in a logical manner.
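A minimal sketch of these cleaning steps for a text dataset; the records and the handling rule for missing labels are hypothetical:

```python
# Minimal sketch of the cleaning steps above for a text dataset:
# drop empty records, deduplicate, and handle a missing field.
# The records and the "needs_annotation" rule are hypothetical.

raw = [
    {"text": "How do I reset my password?", "label": "support"},
    {"text": "How do I reset my password?", "label": "support"},  # duplicate
    {"text": "", "label": "support"},                             # empty
    {"text": "Buy cheap watches!!!", "label": None},              # missing label
]

def clean(records: list) -> list:
    seen, out = set(), []
    for r in records:
        text = r["text"].strip()
        if not text:                 # remove records with no content
            continue
        if text in seen:             # deduplicate exact matches
            continue
        seen.add(text)
        # Handle missing labels in a logical manner: flag for annotation.
        out.append({"text": text, "label": r["label"] or "needs_annotation"})
    return out

print(clean(raw))
```

At terabyte scale the same logic would run on a distributed engine rather than in-memory lists, but the sequence of decisions stays the same.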

3. Data Annotation

  • Adopt a labeling strategy: If your project requires labeled data, ensure that the labeling is accurate and consistent.
  • Utilize tools and services: Consider using available tools and services that can assist in the data annotation process, making it more efficient.

4. Continuous Improvement

  • Implement feedback loops: Implement feedback mechanisms to continuously improve and refine your data and models.
  • Stay updated: Data needs can change over time. Stay updated on the latest trends and best practices in Data-centric AI to keep improving your processes.

Adopting a Data-centric approach is about focusing on the data as a key component in the success of your AI projects and products. With quality data, and the right approach, you can improve the performance and reliability of your AI models, ensuring the success of your projects.

What are the Future Perspectives on Data-centric AI in Generative AI and NLP?

Looking forward, the emphasis on Data-centric AI is poised to continue shaping the trajectory of Generative AI, NLP, and LLMs innovations and applications. Here’s a glimpse into what the future might hold as this approach becomes more deeply integrated into AI projects.

1. Enhanced Model Performance & Lower Inference Latency

As more projects adopt a Data-centric approach, we can expect to see improvements in how NLP models and LLMs perform, making them more accurate and effective in various applications. Smaller models might outperform larger models and result in low inference latency, better maintainability, and lower deployment costs.

2. Broader Applications

With improved data quality, NLP models could be applied in more diverse areas, expanding their usefulness and impact across different industries and sectors.

3. Improved Collaboration

A focus on data can facilitate better collaboration between technical and non-technical stakeholders, as it allows for a clearer understanding and alignment of project objectives and outcomes.

4. Ethical and Responsible AI

A data-centric approach promotes the consideration of ethical implications, encouraging the development of NLP models that are more responsible and sensitive to societal impacts. Regulatory compliance and AI safety are much easier to ensure if compliance is already addressed at the level of the training dataset. This can reduce the effort of content-moderation filtering after model deployment.

5. Continuous Learning and Adaptability

Emphasizing data can lead to models that are better at learning and adapting over time, making them more resilient and capable of handling new challenges and changes in the language.

In other words: the road ahead for Data-centric AI in NLP looks promising, with the potential for numerous advancements and improvements that will drive success in various projects and applications. It’s an evolving journey that carries the promise of making Generative AI, NLP, and LLMs more robust, adaptable, and aligned with real-world, business needs and challenges.

Data-centric AI – Takeaways

Navigating the complexities of Generative AI and NLP can be a very challenging journey. However, adopting a data-centric approach has the potential to be a transformative strategy that improves the performance, adaptability, and success of your AI projects. You’re now equipped with a foundational understanding and actionable guidance on the key concepts, practical insights, and future perspectives of data-centric AI.

As we stand on the threshold of new possibilities and innovations in NLP, embracing data-centric practices presents a compelling opportunity to drive progress and achieve remarkable outcomes. For a more profound exploration and comprehensive insights into maximizing the potential of Data-centric AI in Generative AI and NLP, we invite you to download our complete white paper on the topic.

More Resources on Generative AI and NLP

Explore other content and tutorials from our recent LLM Mini-Series:

The post Unlocking the Potential of Data-centric AI in Generative AI and NLP appeared first on Positive Thinking Company.

]]>