INDEX
Explanations
references to young individuals or children
Negative Logits
ioneer    -0.17
ucci      -0.17
icari     -0.15
itto      -0.15
xt        -0.15
oui       -0.15
itant     -0.15
coni      -0.15
ease      -0.14
incinn    -0.14
Positive Logits
(er       0.22
blood     0.19
ish       0.16
lings     0.16
erness    0.15
ening     0.15
ofday     0.15
-old      0.15
est       0.15
swick     0.15
Activations Density: 0.035%