Inspiration
One of our team members is affiliated with Berkeley’s Neural Systems and Machine Learning Lab. There, we learned how early-stage neuropharma researchers spend weeks, even months, gluing together literature searches, ad-hoc simulation scripts, and manual regulatory estimates, and how that bottleneck stalls innovation. Given our collective experience in artificial intelligence and machine learning, and an API credit grant from Anthropic, we decided to collapse that workflow into an interactive, end-to-end platform.
What it does
Breaking Good pulls full-text neuroscience papers from Google Research via the Crossref API, looks up chemical structures and bioactivity data from PubChem and ChEMBL, and uses RDKit to simulate binding, toxicity, stability, and synthetic yield. Given a system and user prompt, Claude reads the literature, proposes SMILES candidates with step-by-step rationales, justifications, and visualizations, and generates a regulatory roadmap forecasting FDA timelines and production feasibility by scraping patent and academic databases (enabled by the Model Context Protocol [MCP]). It also offers a manual structure editor, built on Ketcher Standalone, that lets researchers iterate on candidate compounds: draw, edit, build upon, validate, and export molecular structures with full control over atoms, bonds, stereochemistry, and file formats. Finally, users can compare the molecular properties of generated or edited candidates side by side (ranked by Tanimoto similarity) and run simulations to understand ADMET properties and receptor binding affinity.
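The side-by-side comparison above ranks candidates by Tanimoto similarity, which measures the overlap between two molecular fingerprints. A minimal sketch, assuming fingerprints are represented as sets of "on" bit indices (in the real pipeline those bits would come from RDKit fingerprints computed on the backend):

```javascript
// Tanimoto similarity between two fingerprints represented as arrays of "on"
// bit indices: |A ∩ B| / |A ∪ B|. The bit sets below are illustrative only;
// real fingerprints would be produced by RDKit on the Python side.
function tanimoto(bitsA, bitsB) {
  const a = new Set(bitsA);
  const b = new Set(bitsB);
  let intersection = 0;
  for (const bit of a) if (b.has(bit)) intersection++;
  const union = a.size + b.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

// Toy example: two fingerprints sharing 2 of 4 distinct bits.
console.log(tanimoto([1, 2, 3], [2, 3, 4])); // 0.5
```

Identical fingerprints score 1, disjoint ones score 0, which makes the metric easy to read off in a comparison table.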
How we built it
We constructed this platform as a web application with two main parts working together: a frontend the user interacts with in their browser, and a backend server that handles logic and data.

For the frontend, we used React. When a user opens the application, their browser loads our React code. The main App.js component sets up the overall structure using Material-UI components: an AppBar at the top, a navigation Drawer that slides out, and Tabs to switch between the main functional areas like the MoleculeDesigner or SimulationPanel. We manage the user's current view and some shared data (like global notes saved in localStorage) with React's built-in state hooks (useState, useEffect). When the user performs an action, such as designing a molecule or requesting a simulation, the relevant React component (e.g., MoleculeDesigner) uses the axios library to send a request to our backend API. We also included libraries like 3dmol, plotly.js, and smiles-drawer to render interactive chemical structures and data plots directly in the browser.

For the backend, we built a Node.js server on the Express framework. Our index.js file is the entry point: it sets up an Express server that listens for incoming API requests from the frontend. We configured standard middleware: cors to allow requests from our frontend's origin, helmet to add basic security headers, and morgan to log incoming requests for debugging. We organized our API logic into separate route files (api/drugDesign.js, api/simulation.js, api/auth.js). When a request hits an endpoint (e.g., /api/drug-design/calculate-properties), the corresponding Express route handler takes over. For scientific computations that Node.js isn't well suited to, we implemented a bridge: the Node.js handler uses the built-in child_process.spawn function to execute specific Python scripts we wrote (located in utils/rdkit/).
These Python scripts use RDKit to perform the heavy calculations (e.g., property prediction, similarity searches). Each script processes the data and writes its results (usually as JSON) back to the Node.js process, which then sends the final response to the user's browser via the API. We also integrated the @anthropic-ai/sdk within our /api/ai route to call Claude. Authentication is handled via routes in api/auth.js using jsonwebtoken and bcrypt.
Challenges we ran into
For the regulatory analysis feature, we initially considered building our own web scrapers to gather information from sites like the FDA, patent databases, etc. However, we recognized the inherent difficulties in building and maintaining reliable scrapers for constantly changing websites, so we opted to delegate the web scraping and synthesis task to the Anthropic Claude API. Our main challenge then became prompt engineering: figuring out how to precisely instruct Claude via text prompts to find the specific regulatory information we needed (similar drugs, FDA approvals, patent details) and structure its findings into the detailed report format we required. We also faced the challenge of parsing Claude's text responses, as the AI didn't always return perfectly structured data. We had to write regular expressions to extract the different sections (summary, patents, pathway, etc.), which was somewhat fragile, and we built in fallbacks to provide mock data if the API call failed or if our parsing logic couldn't make sense of the response.

Similarly, for the chatbot feature, a key difficulty was ensuring the chatbot had the right context, specifically about a molecule the user might have just generated or selected via a separate API call (like /generate-molecule). Since our /chat API endpoint on the backend didn't automatically know about the results of previous, unrelated API calls, we designed it to accept the entire conversation history from the frontend. This meant the challenge fell largely on the frontend implementation: we had to ensure that when the user started chatting after selecting a molecule, the frontend code correctly grabbed the relevant molecule details (SMILES, properties) and inserted this context into the messages array before sending it to the backend /chat API. If the frontend didn't pass this context correctly, the chatbot wouldn't be aware of the specific molecule the user was referring to.
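The frontend-side context injection described above can be sketched roughly as follows. The function name and molecule field names are illustrative assumptions, not our actual identifiers:

```javascript
// Sketch: before sending the conversation to the backend /chat endpoint,
// prepend the currently selected molecule's details so the model knows which
// compound the user is referring to. Field names are hypothetical.
function buildChatMessages(history, userText, molecule) {
  const messages = [...history];
  if (molecule) {
    messages.push({
      role: 'user',
      content:
        `Context: the molecule under discussion is ${molecule.smiles} ` +
        `(MW ${molecule.molecularWeight}, logP ${molecule.logP}).`,
    });
  }
  messages.push({ role: 'user', content: userText });
  return messages;
}
```

The backend then forwards the full messages array to Claude, so no server-side session state is needed.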
Accomplishments that we’re proud of
We are proudest of our molecular visualization capabilities, built on two key integrations. First, we embedded the Ketcher Standalone structure editor within our React frontend, establishing robust two-way communication that allows users to load AI-generated molecules, make real-time modifications, extract structures as Molfiles, and access essential editing functions. Second, we implemented 3D visualization by using RDKit on the backend to convert SMILES strings into 3D conformers, which are then rendered on the frontend with 3Dmol.js as interactive, rotatable models. This creates a seamless environment where scientists can directly manipulate and visualize chemical structures, significantly improving their understanding of molecular spatial arrangements and boosting overall research productivity.
What we learned
This project really underscored the challenges and nuances of applying large language models, like Claude, in a field like drug R&D where precision and reliability are paramount. A major learning curve involved reproducibility. We found that even with fixed parameters like temperature, getting the exact same output from the AI for the same input wasn't always guaranteed, especially across different model versions or minor updates. This variability is tricky in a scientific context where consistent results are expected.

This led us to realize how absolutely critical prompt engineering is. Since we didn't have the ability to train or fine-tune a custom machine learning model specifically on neuropharmacology or regulatory data, crafting the perfect prompt became our primary tool for guiding the AI. We learned that subtle changes in wording, the structure of the request, the persona we asked the AI to adopt (e.g., "You are a medicinal chemist..."), and explicitly telling it how to structure its answer (like demanding "SMILES: [string]" on its own line) drastically affected the quality, relevance, and usability of the outcomes. Getting useful, structured data back required significant iteration and careful prompt design.

Furthermore, working in a space with such a small margin for error meant we couldn't just trust the AI's output blindly. Generating a chemically invalid molecule or providing inaccurate regulatory advice could have serious consequences. We had to constantly grapple with optimizing for accuracy using only the tools available, primarily prompting and external validation. Because we were leveraging a general-purpose AI without deep, baked-in domain expertise from custom training, we had to implement strict checks on its output. For example, we couldn't just accept a SMILES string Claude generated; we had to pipe it through RDKit on our backend to validate its chemical plausibility. This post-processing validation became non-negotiable.
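The "SMILES: [string]" output contract mentioned above makes the first stage of that validation mechanical: pull the candidate string out of the free-text response, then hand it to RDKit for real chemical validation. A minimal sketch (the regex and function name are illustrative):

```javascript
// Sketch: extract the candidate SMILES from a model response that was prompted
// to emit a line of the form "SMILES: <string>". Returns null when the model
// didn't follow the format, so the caller can retry or fall back. The string
// still goes through RDKit afterwards; this only checks the text contract.
function extractSmiles(responseText) {
  const match = responseText.match(/^SMILES:\s*(\S+)\s*$/m);
  return match ? match[1] : null;
}
```

Returning null (rather than guessing) keeps format failures visible, which is what made our retry-and-fallback logic possible.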
Essentially, we learned that using current LLMs effectively in high-stakes scientific domains requires treating them as incredibly powerful but sometimes fallible assistants; meticulous prompt engineering is needed to guide them, and rigorous, domain-specific validation of their output is essential to ensure accuracy and safety.
What’s next for Breaking Good
Next, we’ll extend Breaking Good beyond neuroscience into oncology, immunology, and rare-disease R&D by fine-tuning Claude on new domain-specific corpora and training bespoke simulation and feasibility models for each area. We’ll integrate specialized APIs and toolkits—like protein-ligand docking engines for cancer targets and immunogenicity predictors for vaccine candidates—and build additional MCP services to handle new data sources and ML-driven regulatory projections. We hope that this modular, model-centric approach will let us rapidly onboard new therapeutic areas, leverage pre-trained and custom-trained models, and ultimately automate in-silico design, simulation, and regulatory forecasting across the full spectrum of drug discovery.
Built With
- 3dmol
- anthropic-api
- axios
- bcrypt
- bioc
- child-process
- cors
- crossref-api
- express.js
- helmet
- javascript
- jsonwebtoken
- material-ui
- mdanalysis
- meeko
- mongodb
- mongoose
- morgan
- ncbi-rest-apis-(including-e-utilities)
- node.js
- numpy
- open-babel
- pandas
- pg-(node-postgres)
- plotly.js
- postgresql
- pubchem-pug-rest
- python
- rdkit
- react
- smiles-drawer
- socket.io
- vina