    Negative Logits
    Token            Logit
    " lain"          -0.07
    "_students"      -0.06
    "ontology"       -0.06
    "ilip"           -0.06
    " forgotten"     -0.06
    "/car"           -0.06
    " olmam"         -0.06
    "_pet"           -0.06
    " внеш"          -0.06
    "veget"          -0.06
    Positive Logits
    Token            Logit
    "-click"         0.08
    " lingering"     0.06
    "_reduction"     0.06
    "(kind"          0.06
    "(init"          0.06
    "}.${"           0.06
    " );↵"           0.06
    " ISPs"          0.06
    "->["            0.06
    "\tinitialize"   0.06
    Act Density 0.020%

    No Known Activations