INDEX
Explanations
military-related terminology and actions
Head Attr Weights
0:0.04
1:0.05
2:0.07
3:0.06
4:0.07
5:0.06
6:0.26
7:0.09
8:0.04
9:0.06
10:0.08
11:0.07
Negative Logits
omore: -1.53
woes: -1.41
womb: -1.33
liner: -1.30
bilt: -1.29
Dust: -1.29
jealous: -1.28
Reign: -1.24
mania: -1.24
////////////////////////////////: -1.24
Positive Logits
letcher: 1.41
eger: 1.39
ggles: 1.38
instruct: 1.36
ogle: 1.32
channel: 1.32
cknow: 1.26
Elk: 1.26
advises: 1.25
speaker: 1.23
Activations Density 0.000%