INDEX
Explanations
phrases related to movement or action
Head Attr Weights
0:0.09
1:0.09
2:0.08
3:0.09
4:0.08
5:0.07
6:0.07
7:0.07
8:0.09
9:0.07
10:0.08
11:0.09
Negative Logits
defeats         -2.31
bombings        -2.19
wrapper         -2.17
OTA             -2.13
Vaj             -2.12
airstrikes      -2.07
certification   -2.02
Typhoon         -1.96
UD              -1.95
::::::::        -1.94
Positive Logits
mates           2.55
ettes           2.39
sonian          2.39
inge            2.32
enes            2.31
iles            2.21
ources          2.19
spons           2.17
archive         2.13
src             2.06
Activations Density 0.000%