INDEX
    NEGATIVE LOGITS
    Token               Logit
    "_blocked"          -0.07
    "ider"              -0.06
    " medals"           -0.06
    " dealer"           -0.06
    "\tcache"           -0.06
    ".listener"         -0.06
    (blank in source)   -0.06
    " září"             -0.06
    " procurement"      -0.06
    " vocab"            -0.06

    POSITIVE LOGITS
    Token               Logit
    "requires"           0.07
    (blank in source)    0.07
    " Magn"              0.06
    " Keeping"           0.06
    " Franken"           0.06
    "-La"                0.06
    " niệm"              0.06
    " باق"               0.06
    " cevap"             0.06
    "erdale"             0.06
    Activation Density: 0.044%

    No Known Activations