    Explanations

    death, penalty, threats, core

    Negative Logits
     Wren
    1.00
    南京
    0.98
     broken
    0.97
    0.94
    aec
    0.93
    ação
    0.90
     torn
    0.84
     massac
    0.84
     vomit
    0.84
    anity
    0.83
    Positive Logits
     penalty
    1.12
    penalty
    1.05
    Penalty
    1.04
     Penalty
    1.03
     мозга
    1.02
    せる
    0.98
    posaż
    0.96
    bed
    0.92
    مة
    0.92
    0.91
    Act Density 0.089%

    No Known Activations