INDEX
    Explanations
    Negative Logits
    Token         Logit
    "Japanese"    -0.07
    " bunker"     -0.07
    ".sigmoid"    -0.06
    " corros"     -0.06
    " kultur"     -0.06
    " konz"       -0.06
    " tabBar"     -0.06
    " прис"       -0.06
    " Peters"     -0.06
    " Ninth"      -0.06
    Positive Logits
    Token                          Logit
    " analogue"                    0.09
    "_not"                         0.07
    "γκ"                           0.07
    "*width"                       0.06
    ".Visible"                     0.06
    "\t                       "    0.06
    " nearest"                     0.06
    " Như"                         0.06
    "ош"                           0.06
    "로그"                         0.06
    Act Density 0.003%

    No Known Activations