INDEX
Explanations
proper nouns, likely names of individuals
proper nouns, particularly names of people and entities
Negative Logits
âĶĢâĶĢ        -0.84
etheless      -0.80
DCS           -0.71
BILITIES      -0.70
ournal        -0.70
imbabwe       -0.69
UTERS         -0.69
ailability    -0.68
ATIONAL       -0.68
ITIES         -0.68
Positive Logits
hart      0.98
hoff      0.96
enberg    0.96
mann      0.91
meyer     0.91
gaard     0.90
stad      0.90
gren      0.90
burn      0.90
ley       0.88
Activations Density 0.223%