INDEX
Negative Logits
hurting     -0.07
quad        -0.07
retrospect  -0.06
Aid         -0.06
<strong     -0.06
Gay         -0.06
daß         -0.06
sthrough    -0.06
عام         -0.06
Thông       -0.06
Positive Logits
=df         0.07
astreet     0.07
Perf        0.06
erglass     0.06
đậu         0.06
тис         0.06
reinforced  0.06
="#         0.06
perme       0.06
&_          0.06
Activations Density: 0.010%