INDEX
    Explanations
    Negative Logits
     switches     -0.07
     affine       -0.07
     cortical     -0.07
    ="<?          -0.07
     Urs          -0.06
    ρέπει         -0.06
     Yii          -0.06
    ienda         -0.06
    вати          -0.06
     ACK          -0.06
    Positive Logits
     sense        0.10
    ?a            0.07
     natural      0.07
    ศร            0.06
     сест         0.06
     Diy          0.06
     },↵          0.06
     consolidate  0.06
    natural       0.06
                  0.06
    Activation Density: 0.003%

    No Known Activations