INDEX
    Explanations
    Negative Logits
    Token            Logit
    " many"          -0.08
    " one"           -0.08
    " a"             -0.07
    ")a"             -0.07
    " One"           -0.07
    " Yugoslavia"    -0.06
    " an"            -0.06
    "tbl"            -0.06
    "$a"             -0.06
    "adar"           -0.06
    Positive Logits
    Token            Logit
    " være"          0.07
    " Floors"        0.07
    " Work"          0.07
    " genuine"       0.06
    " openness"      0.06
    " quindi"        0.06
    " Chocolate"     0.06
    " 생각"          0.06
    " декабря"       0.06
    " Crusher"       0.06
    Act Density 0.206%

    No Known Activations