AI application safety and security

Cossil tests AI applications by conducting Red Teaming and Penetration testing of the AI application including the LLM model

Overview of vulnerabilities audited

1.Bias and stereotypes

Causes:

-        Implicit bias present in the foundation model

-        Wrong document used to build the answer

2. Sensitive information disclosure

Causes:

-        Inclusion of sensitive data in the documents available to the chatbot

-        Inclusion of private information on the prompt which get leaked

3. Service disruption

Causes:

-        Large number of requests

-        Long requests

-        Crafted requests

Consequence: app is unavailable for legitimate users or enormous costs are incurred

 4. Hallucinations

Causes:

-        Suboptimal retrieval mechanism

-        Low quality documents get misinterpreted by the LLM

-        LLM tendency to never contradict the user

We are testing the robustness, fairness and ethical boundaries of LLM systems by using the following techniques:

1.    Exploiting text completion

2.    Using biased prompts

3.    Direct prompt injection

4.    Gray box prompt attacks (This is a different way to bypass safeguards: completely reshape the prompt given that you know the structure of the prompt.)

          5.    Advanced technique: Prompt probing (The advanced way to bypass safeguards is to try to discover the system prompt)