INDEX
Explanations
references to members of the royal family, specifically the word "Prince"
Negative Logits
anmar      -0.17
ctype      -0.16
povol      -0.15
isoft      -0.15
agogue     -0.15
memberOf   -0.15
RITE       -0.15
Garland    -0.15
IBOutlet   -0.14
прит       -0.14
Positive Logits
(ss        0.25
/ss        0.24
esses      0.23
ps         0.22
ess        0.20
Consort    0.20
charming   0.19
essa       0.19
ippet      0.18
есс        0.18
Activations Density: 0.009%