INDEX
Explanations
names of individuals or entities in quotes
proper nouns, specifically names of people or characters
Negative Logits
ãĥ¼ãĥĨ        -0.61
ĸļ            -0.60
Thumbnails    -0.58
tumblr        -0.54
destro        -0.51
Cerberus      -0.51
åĤ            -0.51
è¦ļéĨĴ        -0.51
*.            -0.51
womb          -0.50
Positive Logits
ham           0.59
ente          0.59
sey           0.59
elli          0.58
han           0.58
mar           0.58
aj            0.58
anta          0.58
ena           0.57
hend          0.57
Activations Density 0.670%