    Explanations

    phrases that indicate potential threats or dangers

    Negative Logits
    "thâu"               -0.47
    " Band"              -0.43
    " disambiguazione"   -0.41
    "ANNES"              -0.41
    " démocr"            -0.40
    " bandoulière"       -0.40
    " Interpre"          -0.40
    "indro"              -0.40
    " judiciaire"        -0.40
                         -0.39
    Positive Logits
    " threat"     1.10
    " threats"    1.02
    "threat"      1.00
    "Threat"      0.98
    " danger"     0.98
    " Threat"     0.96
    "Threats"     0.93
    " amenaza"    0.90
    " Gefahr"     0.87
    " Threats"    0.87
    Activation Density: 0.066%

    No Known Activations