    Explanations

    self-defense justification or necessity

    Negative Logits
     banal          0.48
    安定              0.44
     indiscrimin    0.44
     ఆదేశ           0.43
    revenue         0.43
     zeta           0.41
     stabilise      0.41
     pasando        0.40
     hate           0.40
     callous        0.40

    Positive Logits
    łą              0.45
     stranded       0.43
    ıları           0.43
     nécessité      0.38
     ತು             0.38
     appropriately  0.38
     फं             0.38
                    0.37
     American       0.36
     нужда          0.36
    Activation Density: 0.040%

    No Known Activations