Explanations
references to royal figures or locations
references to monarchy and royal titles
Negative Logits
pos       -0.73
techn     -0.70
printf    -0.69
vert      -0.69
ãĥ£       -0.67
fix       -0.65
WAR       -0.65
roll      -0.65
VD        -0.65
Nordic    -0.63
Positive Logits
Majesty        1.26
doms           1.05
DOM            0.90
conservancy    0.86
mares          0.86
perty          0.83
pard           0.76
oice           0.76
peror          0.75
forbid         0.74
Activations Density 0.021%