Explanations
words related to personal names
words related to alcoholic beverages or distillation
Negative Logits
lance         -0.84
awei          -0.70
rences        -0.69
Seym          -0.68
population    -0.66
eas           -0.66
ingly         -0.65
kward         -0.65
spin          -0.64
aepernick     -0.64
Positive Logits
osaurs        1.03
ned           1.02
ny            0.98
fo            0.97
ergic         0.93
eman          0.92
strument      0.91
itis          0.90
jad           0.85
igans         0.84
Activations Density 0.148%