INDEX
    Explanations
    Negative Logits
     pounding        -0.07
    _books           -0.07
    Cadastro         -0.06
     nhân            -0.06
    Forbidden        -0.06
     очень           -0.06
     но              -0.06
    ίγ               -0.06
     fontWithName    -0.06
     visibility      -0.06
    Positive Logits
    ']>;↵            0.07
     gerne           0.07
     thesis          0.06
    (""))            0.06
    리는              0.06
     starring        0.06
    WSTR             0.06
    .TRAILING        0.06
    _WP              0.06
     Kobe            0.06
    Activation Density 0.011%

    No Known Activations