INDEX
    Explanations
    Negative Logits
     dipl       -0.08
     તમામ       -0.08
    -elect      -0.07
     entrando   -0.07
     Lag        -0.07
    _forms      -0.07
     છતાં       -0.07
     Hobbit     -0.07
     target     -0.07
     Legisl     -0.07
    Positive Logits
     commas        0.10
     punctuation   0.09
     промеж        0.08
    wards          0.08
     לח            0.08
     ‘’            0.08
     مربع          0.07
    !,             0.07
     radicals      0.07
     nowrap        0.07
    Act Density 0.007%

    No Known Activations