INDEX
Explanations
words related to sports actions or performance
Head Attr Weights
0:0.09
1:0.08
2:0.09
3:0.07
4:0.08
5:0.07
6:0.07
7:0.09
8:0.08
9:0.08
10:0.07
11:0.08
Negative Logits
ゼ:-2.31
(blank):-2.27
poral:-2.01
Ke:-2.00
Word:-1.90
stellar:-1.89
フ:-1.89
ometown:-1.87
word:-1.86
Topics:-1.86
Positive Logits
destro:2.10
abort:1.99
intent:1.92
arming:1.89
beware:1.89
prevention:1.87
ggle:1.85
caveat:1.82
etheless:1.81
unde:1.81
Activations Density 0.000%