INDEX
Explanations
references to historical and biblical figures and events
Negative Logits
aign     -0.14
¤¤       -0.14
Spart    -0.14
rame     -0.14
rof      -0.14
conds    -0.14
pawn     -0.14
em       -0.14
.sb      -0.13
ÏĥÏĢ     -0.13
Positive Logits
Solomon  0.21
Sol      0.19
Nathan   0.18
inton    0.16
king     0.16
David    0.15
Wolfe    0.15
森       0.15
David    0.14
King     0.14
Activations Density: 0.029%