INDEX
    Negative Logits
    Token       Logit
    " WI"       -0.08
    " IQ"       -0.08
    " będą"     -0.07
    (blank)     -0.07
    " WO"       -0.07
    " jen"      -0.07
    "остав"     -0.07
    " bap"      -0.07
    (blank)     -0.07
    "oloģ"      -0.07
    Positive Logits
    Token            Logit
    "research"       0.09
    " researching"   0.09
    "Research"       0.08
    " araştır"       0.08
    " research"      0.08
    "阅读"           0.08
    " investigar"    0.08
    " reviewing"     0.08
    "Slow"           0.08
    "tutorial"       0.08
    Activation Density: 0.004%

    No Known Activations