INDEX
    Explanations
    Negative Logits
    Work          -0.09
    _month        -0.08
     Trabajo      -0.08
    _year         -0.08
     sigh         -0.08
    roll          -0.08
    pager         -0.08
     autoria      -0.08
    dot           -0.08
    entscheid     -0.07
    Positive Logits
                  0.08
    赚钱          0.08
    уск           0.08
     gospod       0.08
                  0.08
     aanval       0.08
     grij         0.08
     opportun     0.07
    Emb           0.07
     Paste        0.07
    Act Density 0.002%

    No Known Activations