INDEX
Explanations
names of notable people or references to prominent figures
Negative Logits
Token     Logit
STANCE    -0.18
older     -0.17
luck      -0.15
OLDER     -0.15
dn        -0.15
اور       -0.15
OLON      -0.15
ullo      -0.14
olor      -0.14
ential    -0.13
Positive Logits
Token     Logit
querque   0.27
azeera    0.23
Al        0.18
veriş     0.16
gren      0.16
andro     0.15
Clock     0.15
clock     0.15
olen      0.15
ivia      0.15
Activations Density 0.062%