Explanations
emphasized references to significant weight or heaviness across various contexts
Negative Logits
Token        Logit
jour         -0.17
ège          -0.16
atatype      -0.15
icina        -0.14
osen         -0.14
Ïħ           -0.13
HeaderValue  -0.13
_UTF         -0.13
enberg       -0.13
ï¸ı          -0.13
Positive Logits
Token        Logit
-duty        0.51
duty         0.42
Duty         0.39
weights      0.39
hitters      0.29
-weight      0.29
weight       0.29
-handed      0.27
hitter       0.26
wie          0.25
Activations Density: 0.019%