INDEX
Explanations
proper names or entities, likely related to sports or news
names of individuals, particularly those involved in discussions or notable events
Negative Logits
isure       -0.78
ewater      -0.78
lli         -0.76
atism       -0.75
agall       -0.75
emonium     -0.75
awaru       -0.75
estones     -0.74
packing     -0.74
starter     -0.73
Positive Logits
Jeremiah    1.05
Uriel       0.87
Grayson     0.82
Ezekiel     0.81
Ezra        0.80
Vaughan     0.79
miah        0.78
Amos        0.77
Koen        0.77
Hernandez   0.76
Activations Density: 0.010%