INDEX
    Explanations
    Negative Logits
    Token               Logit
    "rego"              -0.09
    "–↵"                -0.09
    "eyes"              -0.08
    " envy"             -0.08
    " malheureusement"  -0.08
    " isip"             -0.07
    "tun"               -0.07
    " unfortunately"    -0.07
    " additions"        -0.07
    " halda"            -0.07
    Positive Logits
    Token               Logit
    " inden"            0.09
    " Temp"             0.08
    "Temp"              0.08
    " acel"             0.08
    " taxpayer"         0.08
    " kerk"             0.07
    "nonce"             0.07
    " Além"             0.07
    (token not shown)   0.07
    (token not shown)   0.07
    Activation Density: 0.001%

    No Known Activations