INDEX
Explanations
names of historical figures or royalty, particularly related to kings
mentions of royalty, particularly the term "King."
Negative Logits
Token      Logit
ATIONAL    -0.75
umbai      -0.70
ename      -0.70
TING       -0.69
Working    -0.69
tsy        -0.68
eredith    -0.67
JUST       -0.66
schild     -0.66
chell      -0.65
Positive Logits
Token      Logit
uin        1.07
lord       1.02
King       0.99
fish       0.98
holder     0.92
Abdullah   0.88
doms       0.88
pin        0.87
osaurs     0.86
STON       0.86
Activations Density 0.009%