INDEX
Explanations
instances of imperative verbs or commands
Head Attr Weights
0:0.07
1:0.07
2:0.07
3:0.08
4:0.10
5:0.10
6:0.07
7:0.08
8:0.08
9:0.08
10:0.08
11:0.08
Negative Logits
BASE: -2.33
mort: -2.09
phia: -2.05
orld: -2.03
アル: -1.97
Californ: -1.94
anmar: -1.91
assic: -1.90
atom: -1.88
ankind: -1.85
Positive Logits
strides: 2.04
✔: 2.01
endorsement: 1.97
Posted: 1.97
Profile: 1.96
ulpt: 1.96
leaflets: 1.94
Warranty: 1.93
Shine: 1.92
playable: 1.92
Activations Density 0.000%