INDEX
    NEGATIVE LOGITS
    "recen"         1.83
    "𝚊"             1.82
    " Auschwitz"    1.78
    "ów"            1.78
    "OrElse"        1.73
    "𝚜"             1.71
    " tune"         1.68
    "𝚍"             1.64
    " ενώ"          1.63
    "ావ"            1.62
    POSITIVE LOGITS
    " perpetrated"  1.67
    "emptive"       1.66
    "م"             1.54
                    1.50
    "oubt"          1.49
                    1.49
    "ਿਸ"            1.49
    " lemmas"       1.49
    " Statutes"     1.47
    "‌تر"            1.44
    Act Density 0.070%

    No Known Activations