INDEX
Explanations
proper nouns related to notable figures or entities
names of individuals, particularly in a context related to sports or entertainment
Negative Logits
Newt         -0.70
ktop         -0.70
itiveness    -0.70
HMS          -0.69
Sussex       -0.69
culosis      -0.68
terday       -0.66
PLIED        -0.65
Eliot        -0.64
Appalach     -0.63
Positive Logits
ondo         0.91
aldi         0.89
á            0.89
ña           0.88
acc          0.85
ello         0.85
acho         0.83
arez         0.83
uz           0.82
iro          0.82
Activations Density: 0.158%