    Negative Logits
    Token           Logit
    "imbledon"      -0.07
    "갔다"          -0.07
    "_targets"      -0.06
    "_print"        -0.06
    " юрид"         -0.06
    "_TA"           -0.06
    " яй"           -0.06
    "emoc"          -0.06
    " Wimbledon"    -0.06
    "šky"           -0.06
    Positive Logits
    Token           Logit
    "\tpthread"     0.07
    " attributed"   0.07
    "‌انبار"         0.07
    " Improve"      0.06
    " apple"        0.06
    " instruction"  0.06
    " constantly"   0.06
    " AVAILABLE"    0.06
    " thanking"     0.06
    "َك"            0.06
    Activation Density: 0.026%

    No Known Activations