The Wild West of AI Innovation

An Article on Managing AI Product Risk

By Alex Kalinin, Sid Palani, and Vik Chaudhary


(Image generated by Google’s ImageFX)

The allure of AI (particularly Generative AI) is undeniable. Its potential seems limitless, from generating captivating images and videos to powering self-driving cars to aiding medical diagnoses. However, behind the glitz and glamor lies a complex reality riddled with risk. As AI practitioners and former members of Meta’s AI Infrastructure organization, we find navigating this “Wild West” of innovation thrilling, but we seek to do so with cautious optimism.

Let’s explore potential pitfalls lurking in the shadows and suggest some tactics to address them, aiming to equip ourselves for a safer, more responsible journey through the frontier. Some of the risks we will cover are:

  1. Poor Accuracy 
  2. Security, Privacy, Ethics, and Regulation
  3. Dependencies on Third-Party AI Models
  4. Cost of Implementing an AI Architecture

Please note that this list is not exhaustive and is focused on the risks of implementing AI in products.

Accuracy, Bias, and Quality: When Hallucinations Outweigh Truth

Imagine feeding a story prompt to an AI and receiving a masterpiece you swear could have been penned by J.K. Rowling herself. Sounds magical, right? Well, hold on. Generative AI models, like LLMs and diffusion models, are essentially digital storytellers or artists trained on massive datasets. While they weave impressive narratives, they often suffer from “hallucination,” fabricating details not present in the data. This may feel low-risk in the creative use cases popular today, but the risk of misinformation increases dramatically in use cases that demand high trust and adherence to policies and regulations, such as healthcare and financial services. Another risk is biased results: a model’s predictions may be less accurate for contexts that are underrepresented in its training data. Finally, models may simply not have access to the most up-to-date data relevant to the use case, and so return stale or incorrect information.

Biases enter Machine Learning projects through data (especially training data) and through algorithms; a simple per-group check is sketched after the list below:

  • Labeling Bias: Errors or inconsistencies in data labeling introduce bias, especially in subjective tasks like sentiment analysis or image classification.
  • Historical Bias: Datasets reflect existing societal biases, such as gender or racial bias.
  • Sampling Bias: Datasets may not be representative of the population they aim to model, leading to skewed predictions.
  • Algorithmic Bias: ML algorithms may favor certain groups or exhibit bias in their decision boundaries.
  • Human Bias: Developers or data annotators may introduce their own biases into the ML pipeline, for example through domain experience or confirmation bias.
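
Before launch, it helps to quantify some of these biases rather than only naming them. The minimal Python sketch below compares a model’s accuracy across groups in an evaluation set to surface sampling or historical bias; the column names, groups, and 10% tolerance are hypothetical and should be adapted to your own data.

    import pandas as pd

    # Hypothetical evaluation set: model prediction, true label, and a group
    # attribute (e.g., region, gender, or language) for each example.
    eval_df = pd.DataFrame({
        "group":      ["A", "A", "B", "B", "B", "C", "C", "C"],
        "label":      [1, 0, 1, 1, 0, 1, 0, 1],
        "prediction": [1, 0, 0, 1, 1, 1, 0, 0],
    })

    # Per-group accuracy; large gaps point to under-represented contexts
    # (sampling or historical bias) in the training data.
    per_group = (
        eval_df.assign(correct=eval_df["label"] == eval_df["prediction"])
               .groupby("group")["correct"]
               .mean()
    )
    print(per_group)

    gap = per_group.max() - per_group.min()
    if gap > 0.10:  # hypothetical launch tolerance
        print(f"WARNING: accuracy gap of {gap:.0%} across groups; review data coverage.")

The same pattern extends to other fairness metrics, such as comparing false-positive rates across groups.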

What Can We Do About It?

  • Require monitoring and evaluation to be in place before launching AI features, both at the model level and the user-feedback level. This means aligning on a definition of accuracy, implementing dashboards with appropriate alerts (including training data quality), and ensuring a robust set of channels to monitor feedback directly from users; a minimal sketch of such a check follows this list.
  • Designate a Red Team, a group of people tasked with trying to make models support illicit behavior, produce biased recommendations, and break privacy rules by divulging personal information.
  • Set clear expectations for users about known gaps, recognizing that it is not possible to train AI on every possible scenario and dataset (e.g., Gemini’s disclaimer: “Gemini may display inaccurate info, including about people, so double-check its responses.”).
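
To make the first point concrete, here is a minimal Python sketch of the kind of automated gate a monitoring dashboard can run against an agreed definition of accuracy and against user feedback. The metric names and thresholds are hypothetical placeholders, not a prescribed standard.

    from dataclasses import dataclass

    # Hypothetical thresholds agreed on before launch.
    MIN_ACCURACY = 0.90            # share of answers judged correct on a gold set
    MAX_FLAGGED_FEEDBACK = 0.05    # share of user responses marked "inaccurate"

    @dataclass
    class EvalResult:
        accuracy: float
        flagged_feedback_rate: float

    def alerts_for(result: EvalResult) -> list:
        """Return the alerts a monitoring dashboard would raise for this run."""
        alerts = []
        if result.accuracy < MIN_ACCURACY:
            alerts.append(f"accuracy {result.accuracy:.0%} is below the {MIN_ACCURACY:.0%} target")
        if result.flagged_feedback_rate > MAX_FLAGGED_FEEDBACK:
            alerts.append(f"user-flagged rate {result.flagged_feedback_rate:.0%} exceeds tolerance")
        return alerts

    # Example: a nightly evaluation run feeding an alerting channel.
    print(alerts_for(EvalResult(accuracy=0.87, flagged_feedback_rate=0.08)))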

The Precarious Balance: Privacy, Ethics, Security, and Regulation

Our exploration of the AI sandbox brings urgent questions of privacy, ethics, security, and regulation to the forefront. Let’s briefly touch on each area and mention some startups actively working in it:

Privacy: The potential for AI to exploit personal data, manipulate behavior, or infringe on individual rights raises serious privacy concerns. Existing regulations provide a foundation, but the fragmented landscape creates gaps. We need a unified framework that balances innovation with data protection, empowering individuals to control their information.

  • Fiddler develops privacy-enhancing computation (PEC) tools that allow companies to train and use AI models on sensitive data without compromising security or privacy.
  • Gretel.ai provides tools for generating synthetic data that can be used to train AI models without compromising privacy and offers ways to label data and discover sensitive information embedded in models.
  • Opaque Systems focuses on confidential computing, which involves encrypting data while AI models use it, ensuring sensitive information stays protected even during processing.
  • Private AI develops tools for privacy-preserving natural language processing (NLP), enabling the use of sensitive text data in AI models without compromising private information (a toy illustration of this idea follows the list).
  • Unanimous AI develops federated learning solutions that allow training AI models on decentralized datasets, keeping data private and secure while enabling collaboration.
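
As a toy illustration of the privacy-preserving idea behind several of these tools, the Python sketch below redacts common PII patterns from text before it leaves your boundary, for example before the text is sent to a third-party model. Production systems rely on NER models and far broader coverage; the regular expressions here are illustrative assumptions only.

    import re

    # Toy patterns only: real redaction needs NER models and wider coverage.
    PATTERNS = {
        "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
        "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
        "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    }

    def redact(text: str) -> str:
        """Replace common PII patterns with typed placeholders before the text
        is shared with an external model or service."""
        for label, pattern in PATTERNS.items():
            text = pattern.sub(f"[{label}]", text)
        return text

    print(redact("Contact Jane at jane.doe@example.com or +1 (555) 010-7788."))
    # -> Contact Jane at [EMAIL] or [PHONE].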

Ethics: Biases in AI development can lead to discriminatory outcomes that amplify societal inequalities. Ethical guidelines are crucial to ensure responsible AI development and use. While initiatives from AI organizations are encouraging, a broader conversation is necessary to establish clear ethical principles and guard against unintended harm.

  • Parity specializes in algorithmic fairness solutions focused on detecting and mitigating bias in hiring and lending decisions. Their platform supports companies in building more equitable AI solutions.
  • Veritas provides tools for assessing and mitigating risks within AI models, including bias detection, fairness assessment, and explainability solutions. They help companies understand the ethical implications of their models.

Security: Imagine AI manipulated to create vulnerabilities, disrupt critical infrastructure, or launch cyberattacks. These chilling scenarios need robust security measures. While frameworks like the EU’s GDPR are positive steps, a more nuanced approach is needed to ensure consistent protection against evolving threats.

  • Arthur.ai built “The First Firewall for LLMs” to help companies deploy their LLMs confidently and safely.
  • CalypsoAI helps organizations embrace GenAI rapidly without risking organizational safety and security.
  • Robust Intelligence offers real-time protection and validation of AI models and data.
  • SentinelOne offers an endpoint security platform specifically designed to tackle the challenges of AI model security. Their solution uses real-time behavioral analysis to detect and prevent malicious AI actions.

Regulation: The evolving regulatory landscape presents its own challenges. While fostering innovation, the current patchwork approach raises concerns about consistency and potential loopholes. A comprehensive regulatory framework is crucial to balance innovation with ethical considerations, providing clear boundaries for responsible AI development and deployment.

  • Fairly.ai built a platform-agnostic collaboration and audit tool that provides quality assurance for automated decision-making systems with a focus on fairness and compliance.
  • Ketch provides tools to manage and automate consent collection, data mapping, and compliance with privacy regulations.
  • Transcend offers a privacy platform to help companies comply with privacy regulations (e.g., CCPA, GDPR). They have features specifically designed for AI, such as the ability to automate data discovery and risk assessments for AI models.

What Can We Do About It?

  • Stay informed about evolving regulations, particularly EU guidelines, which we expect to set a precedent for other regions, including the US. When practical, dedicate an individual or team to following regulations and anticipating trends.
  • Recognize that satisfying regulatory requirements will not be enough. Even effective regulations most often address known risks rather than predicting new ones. Learn to operate in an environment in which regulation lags the speed of innovation and is applied inconsistently across geographies.
  • Prioritize transparency in the absence of regulatory guidance. Develop internal quality measures and communicate to users how you safeguard privacy and monitor AI outputs. Understand your vendors’ privacy and data-sharing policies and make sure they align with yours (e.g., Amazon Bedrock does not use your data to train models and isolates it per customer, so your data privacy and security remain intact).

Third-Party Dependence: When the Black Box Changes

Many AI products rely heavily on pre-trained models and APIs from third parties. This offers convenience and efficiency, but at a cost: the inner workings of these components can be opaque, creating a black box that hinders explainability and control. Imagine using a powerful text generation API in your product. An update you cannot control might alter its behavior, potentially affecting your product’s output and functionality. Both researchers and practitioners have observed that “behavior of the ‘same’ LLM service can change substantially in a relatively short amount of time.” This reliance adds another layer of risk, demanding careful evaluation and contingency planning.

What Can We Do About It?

  • Vet third-party models and APIs thoroughly. Dig into component internals where possible, evaluate update policies, analyze vendors’ track records, and favor those who invest in interpretability and explainability as part of their model development (e.g., Gemini 1.5, available at https://gemini.google.com, explains how it generated a summary or rewrote a draft).
  • Diversify and secure alternatives. Use multiple vendors, explore open source models, and negotiate and document service-level agreements with providers.
  • Build monitoring and logging to track the usage and performance of your application so you can detect changes quickly; a minimal logging wrapper is sketched after this list.
  • Regularly reassess dependencies and conduct contingency planning.
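
One way to act on the diversification and monitoring points together is to route every model call through a thin internal interface that logs enough metadata to notice drift and keeps vendors swappable. The Python sketch below is illustrative only; the provider classes and log fields are hypothetical stand-ins for real vendor SDKs.

    import json
    import logging
    import time
    from typing import Protocol

    logging.basicConfig(level=logging.INFO)

    class TextModel(Protocol):
        name: str
        def generate(self, prompt: str) -> str: ...

    class StubProvider:
        """Placeholder for a real vendor SDK (hosted API or open-source model)."""
        def __init__(self, name: str):
            self.name = name
        def generate(self, prompt: str) -> str:
            return f"[{self.name}] response to: {prompt[:40]}"

    def generate_with_logging(model: TextModel, prompt: str) -> str:
        """Call a provider and log enough metadata to spot behavior or latency drift."""
        start = time.time()
        output = model.generate(prompt)
        logging.info(json.dumps({
            "provider": model.name,
            "latency_ms": round((time.time() - start) * 1000, 1),
            "prompt_chars": len(prompt),
            "output_chars": len(output),
        }))
        return output

    # The same interface serves a primary vendor and an open-source fallback.
    for provider in (StubProvider("vendor-a"), StubProvider("open-source-b")):
        print(generate_with_logging(provider, "Summarize our Q3 earnings call."))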

Economics: Justifying Cost and Benefit

Developing and operating AI products demands significant resources. Training complex models requires robust computing power, easily running up hefty cloud computing bills. Relying on third-party services adds further financial considerations, like subscription fees for APIs or licensing costs for pre-trained models. On top of that, the uncertainty surrounding future developments and potential regulation changes makes monetization a delicate dance. A seemingly viable product today, like an AI-powered personalized shopping assistant, might face financial strain tomorrow due to unforeseen cost fluctuations, such as a spike in demand for compute power as the user base grows, or new regulations requiring expensive data anonymization practices.

The cost of training ML models and running inference must be evaluated for any Machine Learning project. According to the Wall Street Journal article “The 9-Month-Old AI Startup Challenging Silicon Valley’s Giants,” Paris-based Mistral AI’s model cost less than €20 million to train. The WSJ wrote, “By contrast OpenAI Chief Executive Sam Altman said last year after the release of GPT-4 that training his company’s biggest models cost ‘much more than’ $50 million to $100 million.”

Ritesh Vajariya, a salesperson at Amazon Web Services, made some back-of-the-envelope estimates of AI costs for a small ML project in his article The Cost of AI: Should You Build or Buy Your Foundation Model? He estimated the cost of a task using off-the-shelf AI models from OpenAI and Meta. The task was:

  • Summarizing 58,200 public companies’ annual reports, each estimated at 55,000 words.
  • Summarizing 3 quarterly reports from the same companies.
  • Transcribing 10,000 words for each of the reports.
  • Generating a sentiment analysis of each 10,000-word call transcription.

Using Amazon Bedrock with Meta’s open-source Llama 2 model, he estimated the cost at roughly $4K, compared to roughly $14K using OpenAI’s GPT-3.5 Turbo model.
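
The arithmetic behind estimates like these is simple enough to reproduce as a sanity check. The Python sketch below shows the shape of such a back-of-the-envelope calculation; the tokens-per-word ratio, per-token prices, and summary length are placeholder assumptions to replace with your provider’s actual pricing and your own workload.

    # Back-of-the-envelope inference cost estimate; every figure below is a
    # placeholder assumption, not a quoted price.
    TOKENS_PER_WORD = 1.33                 # rough conversion for English text
    PRICE_PER_1K_INPUT_TOKENS = 0.0005     # replace with your provider's rate
    PRICE_PER_1K_OUTPUT_TOKENS = 0.0015    # replace with your provider's rate

    def summarization_cost(num_docs: int, words_per_doc: int, summary_words: int) -> float:
        input_tokens = num_docs * words_per_doc * TOKENS_PER_WORD
        output_tokens = num_docs * summary_words * TOKENS_PER_WORD
        return (input_tokens / 1000) * PRICE_PER_1K_INPUT_TOKENS \
             + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT_TOKENS

    # Example: 58,200 annual reports of ~55,000 words each, ~500-word summaries.
    print(f"Estimated cost: ${summarization_cost(58_200, 55_000, 500):,.0f}")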

What Can We Do About It?

  • Clearly justify the specific value that AI delivers compared to deterministic algorithms in traditional software, which will be faster and cheaper to build, test, deploy, and iterate on.
  • Implement model compression techniques, such as quantization or distillation, to reduce compute needs during training and inference; a minimal quantization sketch follows this list.
  • Design modular architectures that can adapt to changing hardware requirements and API updates.
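
On the compression point, post-training quantization is one of the lower-effort options: storing weights as 8-bit integers instead of 32-bit floats reduces memory and can cut inference cost. Below is a minimal PyTorch sketch using dynamic quantization; the toy model is purely illustrative, and any accuracy trade-off should be verified with the same evaluation setup discussed earlier.

    import torch
    import torch.nn as nn

    # Toy stand-in for a much larger model; illustrative only.
    model = nn.Sequential(
        nn.Linear(512, 1024),
        nn.ReLU(),
        nn.Linear(1024, 10),
    )

    # Dynamic quantization stores Linear weights as int8 and quantizes
    # activations on the fly at inference time.
    quantized = torch.ao.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    x = torch.randn(1, 512)
    print(quantized(x).shape)  # same interface as the original fp32 model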

In Conclusion: Embrace AI with Informed Caution

Despite the risks, we are incredibly optimistic that AI will transform products for the better. There is much to be hopeful for: the field is already making great strides in research and development toward more robust, transparent, and explainable models, and regulation and policies to safeguard privacy, prevent misuse, and promote responsible development are underway. By fostering responsible development, we can transform the Wild West of AI into a thriving frontier of technological advancement, shaping a future where both humans and AI flourish.

What Do You Think?

How do you assess and mitigate risk related to your AI work? We would love to hear from you in the comments.

About Neuronn AI

Neuronn AI is a professional advisory group rooted in prominent Artificial Intelligence companies such as Meta. Leveraging extensive expertise in Data Science & Data Engineering, AI Product Management, Marketing Research, Strategic and Brand Marketing, Finance, and Program Management, we offer fractional assistance to startups looking to launch AI-driven product ideas and provide consulting services to enterprises seeking to implement AI to accelerate their outcomes. Neuronn AI’s advisors include Alex Kalinin, Karen Trachtenberg, Enrique Ortiz, Lacey Olsen, Norman Lee, Owen Nwanze-Obaseki, Rick Gupta, Seda Palaz Pazarbasi, Sid Palani and Vik Chaudhary.
