    Explanations

    references to violence or conflict involving authority figures

    Negative Logits
     Autorisations
    -0.40
     Autorizaciones
    -0.36
    र्भ
    -0.35
    raulic
    -0.31
    -0.30
    /#{
    -0.29
     '__
    -0.29
    Ladd
    -0.29
    apt
    -0.28
     apt
    -0.28
    Positive Logits
     للاسماء
    0.68
     killing
    0.67
     assassinated
    0.62
     indígen
    0.61
     assassination
    0.60
    Killing
    0.58
     surla
    0.58
     Geſ
    0.57
    DeleteMapping
    0.57
     houſe
    0.57
    Activation Density: 0.055%

    No Known Activations