INDEX
Explanations
references to personal experiences of pain
Head Attr Weights
0:0.06
1:0.09
2:0.07
3:0.08
4:0.09
5:0.09
6:0.09
7:0.07
8:0.08
9:0.07
10:0.08
11:0.09
Negative Logits
letters     -2.66
auri        -2.59
oenix       -2.40
lines       -2.25
itals       -2.25
vell        -2.24
rin         -2.23
ources      -2.22
uine        -2.20
catentry    -2.13
Positive Logits
misunder          2.55
shenan            2.52
cheating          2.23
privilege         2.19
accidentally      2.19
soDeliveryDate    2.17
SPONSORED         2.15
bash              2.15
cheat             2.15
stuffing          2.14
Activations Density 0.000%