INDEX
Explanations
names of specific individuals in the context of sports
references to specific athletes and their performance statistics
Negative Logits
berus     -0.80
uously    -0.79
uous      -0.73
izards    -0.72
ridor     -0.71
okers     -0.70
roll      -0.68
edo       -0.68
pelling   -0.68
ace       -0.65
Positive Logits
ĺħ                    0.80
ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³    0.75
Ago                   0.71
thood                 0.70
preached              0.69
ership                0.67
Lama                  0.66
ONT                   0.65
¶ħ                    0.65
doms                  0.64
Activations Density 0.045%
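
Several of the positive-logit tokens (ĺħ, ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³, ¶ħ) look like the printable stand-ins that byte-level BPE vocabularies use for raw bytes. The sketch below assumes the tokens come from a GPT-2-style byte-level vocabulary, which the dump itself does not state. It rebuilds that byte-to-character table and maps a token string back to its bytes. The function names are illustrative and are not part of any dashboard API.

```python
from typing import Dict, Optional


def bytes_to_unicode() -> Dict[int, str]:
    """Rebuild the byte -> printable-character table of GPT-2-style byte-level BPE.

    Assumption: the dashboard's tokens use this scheme.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to a code point at 256 and above.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


# Inverse table: printable stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token: str) -> bytes:
    """Return the raw bytes that a byte-level token string stands for."""
    return bytes(UNICODE_TO_BYTE[ch] for ch in token)


def decode_token(token: str) -> Optional[str]:
    """Decode a token to text, or return None if its bytes are not valid UTF-8."""
    try:
        return token_bytes(token).decode("utf-8")
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    for tok in ["ĺħ", "ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³", "Ago", "¶ħ"]:
        print(repr(tok), token_bytes(tok).hex(" "), decode_token(tok))
```

Under that assumption, ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³ decodes to the katakana string サーティワン. ĺħ and ¶ħ consist only of UTF-8 continuation bytes, so they are fragments of longer characters and `decode_token` returns `None` for them.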