INDEX
Explanations
significant verbs indicating action or movement
Head Attr Weights
0:0.08
1:0.08
2:0.08
3:0.08
4:0.09
5:0.07
6:0.07
7:0.07
8:0.08
9:0.09
10:0.08
11:0.07
Negative Logits
advertising   -2.29
wagon         -2.26
onew          -2.25
jah           -2.23
iral          -2.23
fu            -2.21
gif           -2.15
=]            -2.14
laugh         -2.12
lance         -2.11
Positive Logits
individually  2.05
successors    2.01
subdiv        2.01
analogue      2.01
ividual       1.99
randomized    1.97
sten          1.97
attained      1.97
reproduced    1.95
offending     1.91
Activations Density 0.000%