INDEX
Explanations
references to royalty or titles associated with royalty
mentions of various princes, particularly focusing on the name "Prince."
Negative Logits
KT          -0.66
bers        -0.65
fram        -0.64
anwhile     -0.63
ãĤī         -0.61
£ı          -0.61
dispatch    -0.60
onica       -0.59
çĦ          -0.59
bered       -0.59
Positive Logits
loo         0.97
doms        0.91
cipled      0.89
Rupert      0.84
Prince      0.78
Clause      0.77
Albert      0.76
VPN         0.75
afort       0.75
pin         0.74
Activations Density: 0.015%