    Explanations

    a safe and helpful AI assistant

    Negative Logits
     सुरक्षित  0.50
     безопас  0.43
    安全  0.43
    安全性  0.42
     safe  0.40
     безопасность  0.40
     safer  0.40
     bezpeč  0.39
    安全的  0.38
     dreamt  0.37
    Positive Logits
     ​​  0.47
     Hinweis  0.45
     phishing  0.43
     Hiking  0.41
     adware  0.41
     Transmission  0.40
     leistungs  0.40
     UEFI  0.40
     Fallout  0.39
     COMPENSATION  0.39
    Activation Density: 0.010%

    No Known Activations