INDEX
    Explanations
    Negative Logits
    Token           Logit
    "_GAIN"         -0.07
    ".bias"         -0.07
    "ACL"           -0.07
    "_tick"         -0.07
    ".Touch"        -0.07
    " cart"         -0.06
    "лади"          -0.06
    " slou"         -0.06
    " ALOG"         -0.06
    " میلیون"       -0.06
    Positive Logits
    Token           Logit
    "лены"          0.07
    "Love"          0.06
    "(prog"         0.06
    "comma"         0.06
    " demok"        0.06
    " electronics"  0.06
    "........"      0.06
    ",K"            0.06
    "\t\t \t"       0.06
    " Journalism"   0.06
    Activation Density: 0.001%

    No Known Activations