    Negative Logits
        " kebutuhan"      -0.09
        " agréable"       -0.09
        "wanie"           -0.08
        " sesuai"         -0.08
        " angene"         -0.08
        " disadvantage"   -0.08
        "fx"              -0.08
        " inconvenience"  -0.07
        "仕事内容"        -0.07
        " необходимые"    -0.07
    Positive Logits
        " illustrates"    0.16
        " illustrating"   0.16
        " illustrate"     0.13
        " exempl"         0.13
        " demonstrating"  0.13
        " демон"          0.13
        " demonstrates"   0.12
        " demostrar"      0.12
        " Illustr"        0.12
        " Demonstr"       0.12
    Activation Density: 0.168%

    No Known Activations