INDEX
Explanations
gerunds or present participles
Head Attr Weights
Head    0     1     2     3     4     5     6     7     8     9     10    11
Weight  0.08  0.04  0.08  0.09  0.08  0.07  0.08  0.08  0.08  0.07  0.10  0.09
Negative Logits
CONTR      -2.10
STORY      -1.88
STATS      -1.81
CLIENT     -1.80
ENDED      -1.70
SERVICES   -1.68
Editorial  -1.66
PATH       -1.63
Reviewer   -1.62
LIFE       -1.60
Positive Logits
auga       1.63
surprise   1.54
lyak       1.54
disguised  1.53
atz        1.52
thia       1.50
Qin        1.50
atus       1.49
pse        1.48
dq         1.48
Activations Density 0.000%