    NEGATIVE LOGITS
    0.39
    ırm
    0.39
    ьа
    0.39
    sd
    0.37
    なかなか
    0.37
    اب
    0.37
    াপা
    0.37
     jährlich
    0.37
    ϫ
    0.37
    ϭ
    0.36
    POSITIVE LOGITS
     Majority    0.57
     Mojo        0.48
     majority    0.48
     Kotlin      0.47
     Ammo        0.46
     Android     0.46
     Favor       0.46
     Software    0.46
     Paper       0.45
     Paint       0.45
    Act Density 0.127%

    No Known Activations