    Negative Logits
    er    1.38
    y    1.26
    وه    1.01
     فق    0.98
    小程序    0.96
    kt    0.96
    ر    0.95
    cz    0.94
    н    0.91
    ل    0.91
    Positive Logits
     preponderance    1.35
     rematch    1.33
     military    1.29
     saloon    1.29
     obscene    1.28
     triplicate    1.26
    expandedTitle    1.26
     catastrophic    1.24
     harrowing    1.23
     loudspeaker    1.22
    Activation Density: 0.001%

    No Known Activations