INDEX
    Explanations

    foreign languages

    Negative Logits
    industry      -0.08
     ранее        -0.08
    alphabet      -0.08
     활동         -0.08
    .utilities    -0.08
    chips         -0.08
    alias         -0.08
    design        -0.08
    chim          -0.07
    -0.07
    Positive Logits
     LO           0.10
     contrap      0.09
     sanc         0.08
    -Lo           0.08
     Sculpt       0.08
     decl         0.08
    -bootstrap    0.08
    -max          0.08
     prejudice    0.07
     yo           0.07
    Act Density 0.023%

    No Known Activations