INDEX
Negative Logits
Ŀ            -0.65
uci          -0.62
nit          -0.62
kn           -0.61
thumbnails   -0.61
shirts       -0.61
letters      -0.61
ships        -0.59
Texans       -0.59
leaders      -0.59
POSITIVE LOGITS
worldly      0.79
depending    0.76
lobe         0.74
wart         0.74
vowel        0.74
isphere      0.74
hemisphere   0.73
baseman      0.73
consecut     0.71
dayName      0.70
Activations Density 0.057%