INDEX
Explanations
names of individuals, particularly those associated with news or events
names of individuals, particularly those associated with notable events or figures
Negative Logits
Solitaire   -0.84
Goose       -0.78
bucks       -0.76
naire       -0.74
Colossus    -0.73
sburgh      -0.71
Scotch      -0.70
arget       -0.70
deform      -0.69
boulder     -0.69
Positive Logits
ovic        1.05
Rahman      1.05
Hussain     1.03
iyah        1.02
ibn         0.98
Ahmed       0.96
Sheikh      0.95
Mahm        0.95
Ahmad       0.92
Ali         0.87
Activations Density 0.013%