INDEX
    NEGATIVE LOGITS
    Token              Logit
    ".RegisterType"    -0.08
    "entry"            -0.07
    "íme"              -0.07
    "=}"               -0.07
    " Swimming"        -0.07
    "\tdir"            -0.06
    " weaving"         -0.06
    "rus"              -0.06
    "ιν"               -0.06
    " Claude"          -0.06
    POSITIVE LOGITS
    Token              Logit
    " shock"           0.10
    " shocking"        0.09
    "Shock"            0.09
    " Shock"           0.08
    " shocks"          0.08
    "edException"      0.07
    " shocked"         0.07
    " accepts"         0.07
    " approximation"   0.07
    ";("               0.07
    Activation Density: 0.006%

    No Known Activations