Negative Logits
Token          Logit
,param         -0.08
awhile         -0.08
blockquote     -0.07
skyrocket      -0.07
rais           -0.07
Dương          -0.07
appointed      -0.07
pán            -0.07
(blank)        -0.07
incremented    -0.06
Positive Logits
Token          Logit
self           0.09
selfish        0.09
Self           0.08
Sense          0.07
french         0.07
TF             0.07
SELF           0.07
SPI            0.07
(blank)        0.07
self           0.06
Activations Density 0.040%