INDEX
Explanations
words indicating a lack of support or denial
Head Attr Weights
0:0.07
1:0.04
2:0.11
3:0.12
4:0.02
5:0.04
6:0.07
7:0.21
8:0.13
9:0.03
10:0.07
11:0.05
Negative Logits
Swordsman     -1.38
Guards        -1.34
srfAttach     -1.30
Macro         -1.21
Pear          -1.17
Magikarp      -1.16
SetFontSize   -1.13
Regiment      -1.12
ORGE          -1.11
FontSize      -1.11
Positive Logits
ansky         1.34
ramer         1.23
arent         1.17
itate         1.15
agame         1.13
inger         1.13
andom         1.11
lish          1.10
activate      1.08
etsk          1.06
Activations Density 0.033%