INDEX
Explanations
proper nouns related to a particular brand or company
references to specific names or titles in the text
Negative Logits
ernaut    -0.78
tremend   -0.70
bully     -0.65
pets      -0.65
furt      -0.64
oway      -0.64
reluct    -0.63
alogue    -0.60
          -0.59
ascus     -0.59
Positive Logits
ttes      0.81
Jub       0.71
Hades     0.70
zza       0.69
este      0.67
Thrust    0.67
Cipher    0.66
Throne    0.66
Osiris    0.65
Paradox   0.65
Activations Density: 0.624%