    NEGATIVE LOGITS
    だが  0.76
     آنچه  0.66
     Öffentlichkeit  0.64
    십니까  0.61
    ようだ  0.61
     دوسروں  0.60
    শিল্প  0.59
     поскольку  0.59
    TDto  0.59
     школ  0.58
    POSITIVE LOGITS
    !  1.25
     (  1.20
    !!  1.16
     kiddos  1.09
     yummy  1.08
     =  1.02
     sooo  1.02
    (token missing)  1.02
    (token missing)  1.01
     (!  1.00
    Activation Density: 0.005%

    No Known Activations