    Explanations

    mentions of safety and related concepts

    Negative Logits
    Token               Logit
    "initializeApp"     -0.39
    "reactstrap"        -0.37
    " defaultstate"     -0.35
    "createStatement"   -0.34
    " Gables"           -0.34
    " архивлан"         -0.33
    "inghouse"          -0.32
    "uta"               -0.32
    " ويكي"             -0.32
    "piew"              -0.32
    Positive Logits
    Token               Logit
    " safety"           4.41
    "Safety"            4.13
    " Safety"           4.13
    "safety"            4.09
    " SAFETY"           3.80
    "SAFETY"            3.58
    "afety"             2.95
    "安全"              2.69
    " veiligheid"       2.59
    " 安全"             2.38
    Activation Density: 0.095%

    No Known Activations