INDEX
Explanations
references to monsters in various contexts
Negative Logits
Token         Logit
orte          -0.16
gang          -0.16
istry         -0.15
prus          -0.15
icts          -0.15
ãĥ³ãĥ         -0.14
DonaldTrump   -0.14
usa           -0.14
agher         -0.14
Lennon        -0.14
Positive Logits
Token         Logit
ous           0.22
iem           0.17
lijke         0.16
ously         0.16
loff          0.16
liness        0.15
821           0.15
kad           0.15
owie          0.15
inth          0.15
Activations Density 0.025%