INDEX
Explanations
Conjunctions and punctuation, especially the word 'and', commas, and semicolons
Head Attr Weights
0:0.05
1:0.15
2:0.06
3:0.07
4:0.07
5:0.05
6:0.06
7:0.21
8:0.04
9:0.07
10:0.08
11:0.04
Negative Logits
Enough: -2.57
Different: -2.43
precon: -2.30
usha: -2.30
There: -2.29
du: -2.29
Orig: -2.25
Native: -2.21
Assuming: -2.17
Actually: -2.15
Positive Logits
ilial: 3.14
mercial: 2.84
then: 2.79
illon: 2.73
Shib: 2.72
odied: 2.64
later: 2.61
second: 2.55
Tackle: 2.55
yles: 2.52
Activations Density 0.001%