Inspiration

Modern organizations operate across multiple clouds—Azure, AWS, and GCP—each with its own security tooling, logs, and compliance requirements. We were inspired to simplify and unify the security insights process by bringing all of this siloed data together under one Retrieval-Augmented Generation (RAG) platform.

What it does

SecuRAG aggregates multi-cloud security data (logs, alerts, vulnerabilities) and uses Cortex Search to retrieve the most relevant information. It then leverages the Mistral LLM (mistral-large2) on Snowflake Cortex to generate concise, context-aware answers and recommendations. Users can quickly query real-time security insights—like identifying misconfigurations or compliance gaps—via a user-friendly Streamlit interface.

How we built it

- Data Ingestion: We pulled cloud security logs and alerts into Snowflake.
- Index & Retrieval: Cortex Search indexes this data to enable rapid, semantic queries.
- LLM Generation: The Mistral LLM takes the retrieved context and crafts accurate, security-focused responses.
- Front End: Streamlit Community Cloud hosts a streamlined web app for live demos, giving security analysts an intuitive UI to ask questions and view results.
- TruLens: We integrated TruLens to measure and compare retrieval performance and optimize search for the best results.
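The retrieve-then-generate step above can be sketched as a small orchestration function. The `search` and `complete` callables here are hypothetical stand-ins for Cortex Search and the Snowflake Cortex COMPLETE call (mistral-large2); they are injected as parameters so the flow reads in isolation, and the prompt wording is illustrative, not our exact production prompt.

```python
def answer_security_question(question, search, complete, top_k=3):
    """Retrieve the top-k matching log/alert snippets, then ask the LLM
    to answer using only that retrieved context."""
    # `search` stands in for a Cortex Search query returning ranked text chunks.
    context_chunks = search(question)[:top_k]
    context = "\n---\n".join(context_chunks)
    prompt = (
        "You are a cloud security analyst. Using ONLY the context below, "
        "answer the question and suggest remediation steps.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    # `complete` stands in for SNOWFLAKE.CORTEX.COMPLETE with mistral-large2.
    return complete(prompt)
```

In the deployed app, the Streamlit front end collects `question`, and both callables are backed by Snowflake; stubbing them keeps the orchestration logic easy to exercise.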

Challenges we ran into

- Data Normalization: Converting diverse logs and alerts from multiple cloud providers into a consistent format was complex.
- Performance Tuning: Optimizing the retrieval process for speed and accuracy, especially with large volumes of security data.
- Integration Complexity: Ensuring that the Mistral LLM, Cortex Search, and Streamlit communicated seamlessly within the Snowflake environment.
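The normalization challenge comes down to mapping each provider's finding shape onto one common record. The sketch below follows the general shape of AWS Security Hub, Azure Defender, and GCP Security Command Center payloads, but the exact field mappings are illustrative assumptions rather than our production connectors.

```python
def normalize_finding(provider, raw):
    """Map a raw provider-specific finding onto a common
    (provider, resource, severity, description) record."""
    if provider == "aws":
        rec = {
            "resource": raw["Resources"][0]["Id"],
            "severity": raw["Severity"]["Label"].lower(),
            "description": raw["Description"],
        }
    elif provider == "azure":
        props = raw["properties"]
        rec = {
            "resource": props["resourceDetails"]["Id"],
            "severity": props["severity"].lower(),
            "description": props["displayName"],
        }
    elif provider == "gcp":
        rec = {
            "resource": raw["resourceName"],
            "severity": raw["severity"].lower(),
            "description": raw["category"],
        }
    else:
        raise ValueError(f"unknown provider: {provider}")
    rec["provider"] = provider
    return rec
```

Once every finding lands in this shape, a single Snowflake table can hold all three clouds and Cortex Search can index them uniformly.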

Accomplishments that we're proud of

- Actionable Recommendations: The LLM-generated outputs aren’t just summaries—they offer meaningful remediation steps and best practices.
- Scalable Architecture: By leveraging Snowflake and Cortex Search, the system can handle growing datasets with minimal reconfiguration.
- Quantifiable Improvements: Using TruLens, we identified measurable gains in precision and recall when refining our search pipelines.
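The precision and recall gains above reduce to a simple comparison between the documents a query should surface and the documents a pipeline actually retrieved. TruLens wraps this kind of feedback for us; the bare computation is sketched here for clarity, with the inputs assumed to be document identifiers.

```python
def retrieval_metrics(retrieved, relevant):
    """Precision and recall for one query, given the retrieved doc IDs
    and the ground-truth relevant doc IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Averaging these per-query numbers across a labeled query set is what let us compare pipeline variants quantitatively rather than by eyeballing answers.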

What we learned

- The importance of contextual retrieval: High-quality search results dramatically boost the utility of LLM-generated answers.
- Effective prompt engineering: Tailoring prompts with domain-specific security terminology significantly improved response accuracy.
- Measurement pays off: Incorporating feedback loops and metrics (TruLens) can systematically enhance both retrieval and generation quality.
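The prompt-engineering lesson can be illustrated with a template that seeds the model with security vocabulary (CVE identifiers, CIS Benchmark controls, the affected resource). The template text below is an assumption for illustration, not our exact prompt.

```python
# Hypothetical example of a domain-tailored prompt template; the wording is
# illustrative, but the idea is what we found effective: explicit security
# terminology steers the model toward precise, actionable answers.
SECURITY_PROMPT = (
    "You are a cloud security analyst. Reference CVE identifiers, "
    "CIS Benchmark controls, and the affected resource when relevant.\n"
    "Context:\n{context}\n"
    "Question: {question}\n"
    "Answer with concrete remediation steps."
)

def build_prompt(context, question):
    """Fill the security-tailored template with retrieved context and the user question."""
    return SECURITY_PROMPT.format(context=context, question=question)
```

A generic "answer the question" prompt tended to produce summaries; a template like this pushed the model toward the remediation-oriented answers analysts actually want.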

What's next for SecuRAG

- Azure Copilot Integration: Embed our RAG workflow into native cloud LLM agents, such as Copilot, so that developers and security teams can query it seamlessly within their cloud-native DevOps processes.
- Automated Remediation: Move beyond recommendations to apply automated fixes for misconfigurations or vulnerabilities under controlled approval workflows.
- Fine-Tuning the LLM: Further refine Mistral with domain-specific security data to boost accuracy and contextual understanding.
- Modular, Automated Log & Alert Integration: Create standardized connectors that ingest new data sources and alerts from any cloud platform or third-party security vendor with minimal modification.
