Cossil tests AI applications by conducting Red Teaming and Penetration testing of the AI application including the LLM model
Overview of vulnerabilities audited
1.Bias
and stereotypes
Causes:
-
Implicit bias present in the foundation model
-
Wrong document used to build the answer
2. Sensitive
information disclosure
Causes:
-
Inclusion of sensitive data in the documents
available to the chatbot
-
Inclusion of private information on the prompt
which get leaked
3.
Service disruption
Causes:
-
Large number of requests
-
Long requests
-
Crafted requests
Consequence:
app is unavailable for legitimate users or enormous costs are incurred
4. Hallucinations
Causes:
-
Suboptimal retrieval mechanism
-
Low quality documents get misinterpreted by the
LLM
-
LLM tendency to never contradict the user
We are testing the robustness, fairness and ethical boundaries of LLM
systems by using the following techniques:
1.
Exploiting text completion
2.
Using biased prompts
3.
Direct prompt injection
4.
Gray box prompt attacks (This is a different way to bypass safeguards: completely
reshape the prompt given that you know the structure of the prompt.)
5. Advanced technique: Prompt probing (The advanced way to bypass safeguards is to try to discover the system prompt)