    Negative Logits
    Token              Logit
    " sacr"            -0.07
    " homic"           -0.07
    " sph"             -0.07
    " spr"             -0.07
    " Bren"            -0.07
    "Business"         -0.06
    "baum"             -0.06
    " Kra"             -0.06
    " wię"             -0.06
    " vacc"            -0.06

    Positive Logits
    Token              Logit
    "eks"              0.08
    (whitespace)       0.07
    " icon"            0.07
    "notated"          0.07
    " hook"            0.07
    (no token text)    0.07
    " notify"          0.07
    "Js"               0.07
    "Stan"             0.06
    "ıs"               0.06
    Activation Density: 0.009%

    No Known Activations