INDEX
Explanations
mentions of the word "Queen."
mentions of royalty, specifically the term "Queen"
Negative Logits
Token      Logit
odcast     -0.75
razil      -0.73
kson       -0.72
sych       -0.68
aping      -0.65
iance      -0.65
aneous     -0.64
ERSON      -0.62
letcher    -0.62
herer      -0.61
Positive Logits
Token      Logit
Anne       1.00
Queen      0.93
stown      0.89
pin        0.88
Majesty    0.78
pins       0.78
Anne       0.77
Mother     0.77
Queen      0.76
esses      0.75
Activations Density 0.009%