INDEX
Explanations
contractions and informal language indicative of dialogue
Head Attr Weights
0:0.09
1:0.28
2:0.04
3:0.03
4:0.03
5:0.22
6:0.03
7:0.03
8:0.06
9:0.05
10:0.05
11:0.04
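As a reading aid, the weights listed above can be ranked to see which heads dominate the attribution. The following is a minimal sketch, assuming each entry is a `head:weight` pair; the names `head_weights` and `top_heads` are hypothetical and not part of the dashboard.

```python
# Head attribution weights copied from the list above (head index -> weight).
head_weights = {
    0: 0.09, 1: 0.28, 2: 0.04, 3: 0.03, 4: 0.03, 5: 0.22,
    6: 0.03, 7: 0.03, 8: 0.06, 9: 0.05, 10: 0.05, 11: 0.04,
}


def top_heads(weights, k=2):
    """Return the k (head, weight) pairs with the largest weights."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


total = sum(head_weights.values())           # sum of the listed (rounded) weights
top = top_heads(head_weights)                # the two largest entries
share = sum(w for _, w in top) / total       # fraction of the listed weight they hold

print(top)
print(f"share of listed weight: {share:.2f}")
```

With these values, heads 1 and 5 come out on top and together hold roughly half of the listed weight. The listed weights are rounded, so the total need not be exactly 1.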
Negative Logits
�: -1.86
Parables: -1.78
dra: -1.78
û: -1.77
iery: -1.73
OPA: -1.73
dor: -1.71
º: -1.71
ERA: -1.70
idium: -1.69
Positive Logits
work: 2.01
worked: 1.83
working: 1.82
Work: 1.77
works: 1.65
worker: 1.65
work: 1.63
WORK: 1.62
cook: 1.58
revel: 1.57
Activations Density 0.008%