INDEX
Explanations
the mention of specific names, especially related to sports and entertainment
names of people, particularly in contexts involving competition or notable events
Negative Logits
cap       -0.88
gered     -0.84
enium     -0.81
iage      -0.80
yll       -0.79
ership    -0.77
oria      -0.76
seless    -0.74
rec       -0.74
yrim      -0.73
Positive Logits
Browne    0.89
agher     0.80
millenn   0.79
Barnett   0.73
livest    0.72
igham     0.71
staking   0.70
ignty     0.68
Pwr       0.68
challeng  0.66
Activations Density 0.017%