INDEX
Explanations
Proper nouns related to people or places
Proper nouns related to individuals, political groups, or specific entities
Negative Logits
izons       -0.77
¥µ          -0.77
multiple    -0.70
ãĤ¦         -0.70
izoph       -0.68
atorium     -0.67
MpServer    -0.66
sample      -0.66
etheless    -0.62
ļéĨĴ        -0.62
Positive Logits
fault       1.22
who         0.94
versus      0.93
alone       0.90
deciding    0.88
vs          0.86
doing       0.83
Fault       0.81
messing     0.77
whom        0.77
Activations Density 0.252%