    NEGATIVE LOGITS
     eater  -0.07
     swift  -0.06
    	pos  -0.06
     eval  -0.06
     aktar  -0.06
    debug  -0.06
    .area  -0.06
     Πρό  -0.06
     intl  -0.06
     Machine  -0.06
    POSITIVE LOGITS
    (no visible token)  0.06
    ’dan  0.06
    овід  0.06
    örper  0.06
    _ETH  0.06
    (no visible token)  0.06
    ambi  0.06
     joys  0.06
    ��  0.06
    역시  0.06
    Activation Density: 0.073%

    No Known Activations