Explanations
mentions of royalty or princesses
references to "Princess" and related terms
Negative Logits
Token      Logit
iasm       -0.76
iton       -0.73
fiber      -0.64
ional      -0.64
abi        -0.64
Weber      -0.62
eval       -0.61
modem      -0.60
ab         -0.60
defense    -0.59
Positive Logits
Token      Logit
Princess   3.81
princess   2.53
Prin       2.30
Duchess    1.92
Empress    1.73
Prince     1.69
Queen      1.61
Prince     1.58
Goddess    1.57
Sorceress  1.51
Activations Density: 0.014%
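
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the fraction of tokens on which the feature's activation is above zero. The function name `activation_density`, the zero threshold, and the sample data are all assumptions made for illustration; the page itself does not state how its value was derived.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    `activations` is a flat list of per-token activation values for one
    feature. Both the name and the threshold are illustrative assumptions.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 14 activate the feature.
sample = [0.0] * 99_986 + [1.5] * 14
density = activation_density(sample)
print(f"{density:.3%}")
```

With this made-up sample, 14 active tokens out of 100,000 give a density of the same order as the one reported here. A density this low suggests the feature fires on a very narrow set of contexts, which fits the explanations listed for it.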