INDEX
Explanations
words ending with the letters 'ts' and 'es'
Negative Logits
ously         -0.69
decom         -0.59
Netanyahu     -0.59
visitation    -0.58
evolution     -0.57
manipulation  -0.57
BDS           -0.56
resistance    -0.55
nerv          -0.55
Modi          -0.55
Positive Logits
hire          1.26
poon          1.20
ilver         1.12
pring         1.12
omething      1.12
dale          1.12
bury          1.08
poons         1.04
cript         1.02
hare          1.02
Activations Density: 0.035%