INDEX
Explanations
articles and conjunctions that emphasize the importance or significance of a subject
Head Attr Weights
0:0.02
1:0.01
2:0.11
3:0.08
4:0.13
5:0.02
6:0.33
7:0.08
8:0.03
9:0.03
10:0.05
11:0.06
Negative Logits
DAY        -1.65
CAST       -1.41
Mass       -1.41
BAT        -1.39
Report     -1.36
abuse      -1.35
Analysis   -1.34
Posts      -1.33
Ct         -1.33
brance     -1.31
Positive Logits
favoured    1.55
influenced  1.46
hest        1.43
iest        1.39
undermin    1.38
ibaba       1.37
hardest     1.35
strongest   1.35
heric       1.34
oriented    1.31
Activations Density 0.001%