INDEX
Explanations
proper nouns or names associated with notable individuals and events
Negative Logits
Token        Logit
uyomi        -0.76
vana         -0.74
ypes         -0.63
Leilan       -0.61
separat      -0.58
ratom        -0.58
maxwell      -0.57
inhibitor    -0.56
vulner       -0.56
inhibitors   -0.55
Positive Logits
Token        Logit
baugh        0.91
entin        0.83
uez          0.76
zl           0.76
eve          0.68
Santos       0.66
port         0.64
ford         0.64
dal          0.61
ez           0.60
Activations Density 0.016%