INDEX
Explanations
articles and conjunctions within the text
Head Attr Weights
0:0.03
1:0.02
2:0.19
3:0.06
4:0.13
5:0.03
6:0.05
7:0.20
8:0.03
9:0.05
10:0.08
11:0.08
Negative Logits
endeavour: -1.76
endeavor: -1.73
administ: -1.72
aution: -1.68
spons: -1.66
fundament: -1.65
�: -1.63
endeavors: -1.58
procure: -1.56
conduc: -1.53
Positive Logits
redd: 2.02
itely: 1.58
thouse: 1.56
isodes: 1.49
anomaly: 1.49
oon: 1.49
uggets: 1.49
icons: 1.46
ross: 1.46
aliens: 1.44
Activations Density: 0.000%