INDEX
    Explanations
    Negative Logits
     governors      -0.68
     Balt           -0.67
     progress       -0.67
     strengthening  -0.61
    天               -0.60
    urally          -0.59
    adra            -0.59
     retali         -0.58
     Goldberg       -0.56
     cosmetic       -0.56

    Positive Logits
    /?               1.19
    /#               1.11
    /,               1.04
    /-               1.01
    /.               0.96
    /_               0.91
    tml              0.85
    /+               0.84
    biz              0.83
    /)               0.80
    Act Density 0.329%

    No Known Activations