AI Ethics was a complicated topic even before ChatGPT (and other Large Language Models, or LLMs) and Generative AI burst onto the scene. In this article, we cover how LLMs and Generative AI have changed the ethics landscape, and what businesses can do to keep up and keep going.
AI Ethics is a vast and complex topic. At a high level, the intent of Ethical AI is to ensure systems and technology are created and operated in line with human value systems and concern for the environment. Within this broad goal, AI Ethics consists of several components, as outlined here.
Where do Large Language Models fit in?
Large Language Models inherit all of the general AI Ethics concerns and amplify many of them. Given how pervasive they have become, and in such a short time, these issues are even more pressing. A few examples of the ethics issues of LLMs:
Note: I do not include hallucinations, since mistakes are part and parcel of every AI model.
Where does Generative AI fit in?
The relationship between Generative AI and LLMs is defined differently depending on who you ask, but one workable definition is that LLMs are a subset of generative AI: they generate text and are particularly strong at text-based queries. For the purposes of this article, we can consider the ethical challenges of Generative AI to include all those of LLMs, plus additional challenges related to models that generate content in multiple modalities (imagery, video, sound, etc.).
- Content ownership: Lawsuits over content ownership have begun, spanning all content modalities, including text. The focus is particularly strong in the creator economy, where generative AI can upend entire traditional business models.
- Misinformation: This also applies to all media types, including text. Generative AI can create highly realistic fake imagery, news articles, and other content that is increasingly challenging to detect. Furthermore, generative AI and LLM-powered chatbots could usher in a new era of social media interference.
What do the laws say?
The laws in this space are still very nascent. For example, it is unclear who owns the copyright to an AI-generated artwork. There is debate in the European Union regarding ChatGPT and potential violations of the GDPR. As noted in the section above, lawsuits over content ownership have also begun, which will give the legal system an opportunity to establish case law on content ownership.
What new technologies are being developed?
Some examples of technology being developed to counter these issues:
- Variants of human feedback incorporated into learning. Reinforcement Learning from Human Feedback (RLHF) is a process by which humans provide feedback on the outputs of, for example, an LLM. This feedback is used to build a reward model that the LLM can use to select the candidate outputs most acceptable to humans. ChatGPT is widely believed to use RLHF, while a competitor, Anthropic, has proposed an alternative, Constitutional AI, in which a structured rules system representing human values and preferences is used in place of active human feedback. In both cases, the technology's intent is to align the AI's outputs with human values and preferences.
- Unlearning. A key to maintaining privacy and good data practices is to control (and be aware of) what data goes into an AI model. Unlearning refers to techniques by which AI models can "forget", or eliminate from their learned understanding, selected data elements. The importance of advancing unlearning is highlighted by Google's recent announcement of a competition to drive new unlearning advances.
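The RLHF selection step described above can be illustrated with a minimal sketch. Everything here is hypothetical: the `reward_model` is a toy heuristic standing in for a learned reward model trained on human preference ratings, not any production system.

```python
# Minimal sketch of RLHF-style candidate selection (all components
# hypothetical): a reward model, trained on human preference feedback,
# scores candidate outputs and the highest-scoring one is returned.

def reward_model(prompt: str, candidate: str) -> float:
    """Stand-in for a learned reward model: this toy heuristic prefers
    answers that mention the prompt's subject and stay brief."""
    subject = prompt.split()[0].lower()
    on_topic = 1.0 if subject in candidate.lower() else 0.0
    brevity = 1.0 / (1 + len(candidate.split()))
    return on_topic + brevity

def select_best(prompt: str, candidates: list[str]) -> str:
    """Pick the candidate the reward model scores highest."""
    return max(candidates, key=lambda c: reward_model(prompt, c))

candidates = [
    "Paris is the capital of France.",
    "I am not sure, but it might be somewhere in Europe.",
]
best = select_best("Paris capital of which country?", candidates)
print(best)  # the on-topic, concise candidate wins
```

In a real RLHF pipeline the reward model is itself a neural network trained on human preference comparisons, and it is used to fine-tune the LLM's policy rather than just to rank outputs at inference time; the ranking step above is only the simplest piece of that loop.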
What can you do for your business?
There is a lot that is still unknown about these technologies, but there are a few things that you can do to protect your business.
- Understand the source of your AI technologies. If they are being built in-house, understand what data (particularly customer data) they are using. For example, if a customer later decides to opt out and requests that all their data be removed from your models, do you know how to do that? If you are using external APIs (i.e., models created and provided by other parties), understand what data you are providing them, even in your queries, and whether that is acceptable to your business.
- Make build vs. buy decisions as appropriate. For example, generic (non-private) queries that are best answered by a large model may be suitable for licensed APIs rather than training a massive generic model in-house. Sensitive queries (whether they contain customers' private data or your own IP) may benefit from custom-built internal models. Even when an internal model is built, make sure to understand its source, since many internally built models start from a base model downloaded from the internet.
- Stay up to date on the latest legal developments as they pertain to your business domain. This space is moving very rapidly with no clear answers, and the outcomes of these developments will shed more light on how governments view these issues.
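The first point above, knowing whether you can honor an opt-out request, comes down to data provenance: tracking which training records came from which customer. A minimal sketch, with an entirely hypothetical record structure:

```python
# Minimal sketch of training-data provenance (structure is hypothetical):
# tag every training record with its owning customer, so an opt-out request
# can enumerate exactly which records a retrain or unlearning pass must drop.

from dataclasses import dataclass

@dataclass
class TrainingRecord:
    record_id: str
    customer_id: str  # who the data came from
    text: str

def records_to_remove(dataset: list[TrainingRecord],
                      opted_out: set[str]) -> list[TrainingRecord]:
    """Return every record owned by a customer who has opted out."""
    return [r for r in dataset if r.customer_id in opted_out]

dataset = [
    TrainingRecord("r1", "cust-42", "support ticket text"),
    TrainingRecord("r2", "cust-7", "chat transcript"),
    TrainingRecord("r3", "cust-42", "product review"),
]
removals = records_to_remove(dataset, opted_out={"cust-42"})
print([r.record_id for r in removals])  # prints ['r1', 'r3']
```

The point is not the code itself but the discipline it represents: if provenance is not recorded at training time, an opt-out request later becomes impossible to honor precisely, and unlearning techniques have nothing to key off.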