INDEX
Explanations
mentions of members of the royal family, particularly individuals such as Prince Andrew and Prince Philip
mentions of royalty or specific princes
Negative Logits
Token      Logit
ologies    -0.73
fram       -0.72
ger        -0.70
selves     -0.69
visors     -0.69
RD         -0.67
ppo        -0.67
bered      -0.66
lder       -0.65
gers       -0.64
Positive Logits
Token      Logit
Rupert     1.02
William    0.92
Charles    0.90
George     0.89
Albert     0.82
Phillip    0.81
Philip     0.79
Arthur     0.79
Clause     0.78
pengu      0.77
Activations Density: 0.026%