12 Aug 2024 / 6:00 PM / ATLAS Building, CU Boulder

We’ve all struggled to get the response we want from a Large Language Model. Emerging techniques now offer better ways to control LLM responses, ensuring more consistent formats and reliability in product workflows. We’ll also discover how advanced observability can enhance your LLM experience and learn the importance of protecting sensitive information across generative AI interactions.

Our first speaker, Uche Ogbuji, will explore emerging techniques for controlling LLM outputs and making them more reliable in product workflows.

Uche Ogbuji is an AI engineering lead, consultant, and startup founder with a long history in AI, data, and network technologies. He contributes to open-source projects like OgbujiPT, co-founded the AI DIY YouTube show, and is also a writer, public speaker, and artist. He also leads the AI for Entrepreneurs and Startups (AES) Subgroup for RMAIIG.

Next, SallyAnn DeLucia, Product Manager at Arize AI, will explore the use of guardrails, datasets, and experiments to enhance AI application performance and reliability. This talk will give AI practitioners actionable insights into optimizing their development processes to achieve cutting-edge results once applications hit production.

SallyAnn DeLucia is a Product Manager at Arize AI, a leader in LLM observability & evaluation. A passionate machine learning enthusiast and generative AI specialist, she holds a master’s degree in Applied Data Science. DeLucia combines a creative outlook with a dedication to developing solutions that are not only technically sound but also socially responsible.

Finally, Aaron Bach, CTO of Liminal, will discuss how organizations are navigating the challenges of data security and privacy in LLM use, including employee attitudes towards AI security policies and the importance of creating tools that satisfy both security teams and end users.

Aaron Bach is a seasoned product development leader with over 15 years of experience in software, hardware, and innovation for Fortune 500 companies. He has led diverse teams and delivered impactful solutions, including overseeing venture concepts and patentable IP at FIS. Previously, Aaron was SVP of Software Development at Four Winds Interactive, where he played a key role in its acquisitions and platform development.

Thanks to NexusTek for sponsoring pizza! NexusTek provides consulting and managed hybrid cloud solutions to reduce the cost and risk of AI development and operations. https://www.nexustek.com/

Notes

The meeting focused on controlling and constraining large language models (LLMs). Dan Murray discussed the group’s 1,600 members and upcoming events, including an engineering subgroup meeting on LLM output stages and a Women in AI meeting with 44 RSVPs. Uche Ogbuji emphasized the importance of structured output and tool calling to enhance LLM reliability. SallyAnn DeLucia highlighted Arize AI’s AI Copilot, which is tested using data sets and experiments. Aaron Bach addressed data security and privacy concerns in AI, noting that 75% of employees in regulated industries use unapproved LLMs, and explained that Liminal supports secure AI usage by redacting sensitive data.

Key Takeaways

Some of the key takeaways from the meeting:

Uche Ogbuji

Discussed the challenges of using large language models (LLMs) in production environments and the need for structured output and guidance
Explained the concept of “prompt engineering” and how it is becoming less important as LLMs become more advanced
Introduced the idea of “structured output” and “grammar-structured output” as a way to control and constrain LLM outputs
Described the two-step process of training the LLM to understand grammar rules, and then guiding the LLM to follow those rules
Provided an example of using a JSON schema to define the structure and allowed outputs for an LLM-powered restaurant menu ordering system
Emphasized the importance of “tool calling” and agentic frameworks, which allow LLMs to interact with external APIs and applications in a controlled manner
Highlighted the recent announcement by OpenAI regarding structured outputs in their API, which significantly improves control and reliability
Discussed the concept of “vector steering,” which is a more sophisticated approach to prompt engineering using latent representations
Stressed that the future of LLM integration will involve less manual prompt engineering and more automated, constrained workflows
Provided examples of open-source tools like llama.cpp that support negative prompting and other advanced prompt control techniques
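
The restaurant-ordering example mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code shown in the talk: the menu, the `ORDER_SCHEMA` dictionary, and the `validate_order` helper are all hypothetical, and the check is hand-rolled with the Python standard library rather than a full JSON Schema validator or a grammar-constrained decoder. It shows only the idea of rejecting any model output that falls outside a declared structure.

```python
import json

# Hypothetical schema in JSON Schema style: an order is a list of items,
# each naming a dish from a fixed menu and a positive quantity.
ORDER_SCHEMA = {
    "type": "object",
    "required": ["items"],
    "properties": {
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["dish", "quantity"],
                "properties": {
                    "dish": {"enum": ["margherita", "pad thai", "jollof rice"]},
                    "quantity": {"type": "integer", "minimum": 1},
                },
            },
        }
    },
}


def validate_order(raw_text, schema=ORDER_SCHEMA):
    """Parse model output and return (order, errors).

    Checks only the handful of schema keywords used in ORDER_SCHEMA.
    """
    try:
        order = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return None, [f"not valid JSON: {exc.msg}"]

    if not isinstance(order, dict) or not isinstance(order.get("items"), list):
        return None, ["expected an object with an 'items' array"]

    item_schema = schema["properties"]["items"]["items"]
    allowed = item_schema["properties"]["dish"]["enum"]
    minimum = item_schema["properties"]["quantity"]["minimum"]

    errors = []
    for i, item in enumerate(order["items"]):
        if not isinstance(item, dict):
            errors.append(f"item {i}: expected an object")
            continue
        if item.get("dish") not in allowed:
            errors.append(f"item {i}: unknown dish {item.get('dish')!r}")
        qty = item.get("quantity")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(qty, int) or isinstance(qty, bool) or qty < minimum:
            errors.append(f"item {i}: quantity must be an integer >= {minimum}")
    return (order if not errors else None), errors


good = '{"items": [{"dish": "pad thai", "quantity": 2}]}'
bad = '{"items": [{"dish": "cheeseburger", "quantity": 0}]}'
print(validate_order(good))
print(validate_order(bad)[1])
```

Validating after generation, as here, is the fallback approach. Grammar-constrained decoding and provider-side structured outputs aim to enforce the same kind of schema while the model is generating, so that invalid output is never produced in the first place.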

SallyAnn DeLucia

Provided an overview of Arize AI’s AI assistant, Copilot, and the importance of optimizing its performance and reliability
Explained the architecture of Copilot, including the use of a “router/planner” to select the appropriate skills and functions to call
Discussed the key components of the router/planner, such as platform data, debugging advice, and state management
Highlighted the popularity of Copilot’s search function and the need to expand its capabilities based on user feedback
Described the process of building data sets and experiments to test and validate Copilot’s function selection
Outlined the steps involved in creating an experiment, including defining the data set, task, evaluator, and GitHub action
Demonstrated how the GitHub action automatically runs the experiment whenever changes are made to the Copilot search functions
Explained the use of the Arize AI dashboard to view the results of the experiments and analyze the performance of the LLM
Discussed the importance of providing explanations for the LLM’s decisions, which helps with understanding and debugging
Mentioned the concept of “LLM light analytics,” where the LLM is used to categorize data and provide insights
Emphasized the value of data sets and experiments in iterating on the Copilot product and improving its reliability
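
The data set, task, and evaluator steps described above could look roughly like this. It is a generic sketch, not Arize's SDK or Copilot's actual code: `route_query` is a keyword-based stand-in for the LLM router, and the function names in the data set are invented for illustration.

```python
# Data set: user queries paired with the function the router should select.
# All function names here are hypothetical.
DATASET = [
    {"query": "find traces with latency over 2s", "expected": "search"},
    {"query": "why did my eval score drop last week?", "expected": "debug_advice"},
    {"query": "search spans mentioning refunds", "expected": "search"},
    {"query": "summarize this dashboard", "expected": "summarize"},
]


def route_query(query):
    """Task: stand-in for the LLM router.

    A real task would call the model; this keyword matcher only keeps
    the sketch runnable offline.
    """
    text = query.lower()
    if "search" in text or "find" in text:
        return "search"
    if "why" in text:
        return "debug_advice"
    return "summarize"


def exact_match(expected, actual):
    """Evaluator: label one example and attach a short explanation."""
    correct = expected == actual
    explanation = "match" if correct else f"expected {expected}, got {actual}"
    return {"correct": correct, "explanation": explanation}


def run_experiment(dataset, task, evaluator):
    """Run the task over every example and aggregate the evaluator's labels."""
    results = []
    for example in dataset:
        actual = task(example["query"])
        results.append(
            {**example, "actual": actual, **evaluator(example["expected"], actual)}
        )
    accuracy = sum(r["correct"] for r in results) / len(results)
    return accuracy, results


accuracy, results = run_experiment(DATASET, route_query, exact_match)
print(f"accuracy: {accuracy:.2f}")
for r in results:
    if not r["correct"]:
        print(r["query"], "->", r["explanation"])
```

A CI job, such as the GitHub action mentioned in the talk, could run a script like this on every change to the routed functions and fail the build when accuracy drops below an agreed threshold.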

Aaron Bach

Emphasized the importance of addressing regulatory compliance and sensitive data leakage in AI applications, particularly in regulated industries like healthcare and finance
Highlighted the paradox of AI’s potential benefits and the need for secure and compliant usage in these regulated environments
Discussed the challenges faced by Chief Information Security Officers (CISOs) in these industries, who are often tasked with saying “no” to new technologies
Explained how Liminal aims to enable CISOs to say “yes” to generative AI by providing a secure and compliant platform
Described Liminal’s approach to detecting and handling sensitive data in AI prompts, including redaction and intelligent masking
Discussed the three main modalities of generative AI usage that Liminal supports: chat, in-app, and app development
Demonstrated the Liminal platform’s administrative dashboard, which allows for fine-grained control and governance over AI models
Highlighted Liminal’s ability to connect to multiple AI model providers, giving organizations flexibility and choice
Explained Liminal’s policy controls, which allow administrators to define how sensitive data should be handled
Provided an example of how Liminal would handle a prompt containing sensitive information, redacting or masking the data while preserving context
Emphasized the importance of not just focusing on technical solutions, but also understanding the needs and constraints of end-users in regulated industries
Shared insights from conversations with healthcare and financial services professionals, highlighting their desire for AI tools that can enhance productivity and efficiency
Discussed the concept of “model agnostic assistance,” where Liminal aims to route prompts to the most appropriate AI model based on the task and user needs
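
The redaction and masking idea described above can be illustrated with a small sketch. This is not Liminal's implementation: the regular expressions, the placeholder format, and the `mask_prompt` and `unmask_response` helpers are assumptions chosen for illustration, and a production system would need far broader detection than two patterns.

```python
import re

# Hypothetical detectors; real systems would cover many more data types.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def mask_prompt(prompt):
    """Replace sensitive spans with numbered placeholders.

    Returns the masked prompt and a placeholder -> original mapping.
    """
    mapping = {}

    for label, pattern in PATTERNS.items():
        def substitute(match, label=label):
            value = match.group(0)
            # Reuse the placeholder if the same value appears again,
            # so the model can still tell that two mentions are related.
            for placeholder, original in mapping.items():
                if original == value:
                    return placeholder
            count = sum(1 for key in mapping if key.startswith(f"[{label}_"))
            placeholder = f"[{label}_{count + 1}]"
            mapping[placeholder] = value
            return placeholder

        prompt = pattern.sub(substitute, prompt)
    return prompt, mapping


def unmask_response(response, mapping):
    """Restore the original values in the model's response."""
    for placeholder, original in mapping.items():
        response = response.replace(placeholder, original)
    return response


masked, mapping = mask_prompt("Email jane.doe@example.com about SSN 123-45-6789.")
print(masked)  # Email [EMAIL_1] about SSN [SSN_1].
print(unmask_response("Drafted a note to [EMAIL_1].", mapping))
```

Typed placeholders keep some context for the model (it still knows an email address or an SSN was present), while the mapping stays on the trusted side so the original values never reach the model provider.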

Q&A

During the Q&A portion:

A question was asked about Liminal’s pricing and whether it could be a cost-effective solution for small startups. Aaron Bach responded that Liminal is designed to be affordable for organizations of all sizes, even individual practitioners, as they aim to provide a more cost-effective alternative to the infrastructure costs of running large language models.
Bill McIntyre asked about Liminal’s approach to handling HIPAA compliance and data security in the healthcare industry. Aaron Bach explained that Liminal works closely with customers to ensure they meet all necessary regulatory requirements, including signing business associate agreements. He also discussed Liminal’s focus on providing transparency and observability around AI interactions.
A question was raised about the potential for prompt injection vulnerabilities and how Liminal addresses this. Aaron Bach acknowledged that prompt injection is a valid concern, but stated that Liminal’s focus has been on the more common use cases they’ve observed in regulated industries, where malicious prompt injection has not been a significant issue. He emphasized Liminal’s approach of working closely with customers to understand their specific needs and risks.
There was a discussion around the concept of “extractive AI” versus “generative AI” and how the terminology is used. Uche Ogbuji provided some context on the historical evolution of these terms and why “generative AI” has become the more common parlance, even when the AI system is primarily extracting and transforming data.
A question was asked about Liminal’s approach to handling attachments or additional data sources provided to the AI system, and how they ensure the security and integrity of that information. Aaron Bach and SallyAnn DeLucia discussed Liminal’s strategies for detecting and handling sensitive data in these scenarios, including the use of heuristics and ongoing collaboration with customers.
The speakers were asked about their strategies for explaining higher-order prompting and prompt engineering concepts to non-technical users. The panelists acknowledged the challenge and discussed approaches like using frameworks from software engineering, as well as focusing on the end-user experience rather than the technical details.

Click here to see a full transcript/recording.