Explanations
phrases that indicate substantial impacts, consequences, or transformations
Head Attr Weights
0:0.03
1:0.05
2:0.12
3:0.03
4:0.02
5:0.05
6:0.08
7:0.04
8:0.25
9:0.14
10:0.08
11:0.07
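
A minimal sketch for reading the listing above, assuming each `head:weight` entry pairs an attention-head index with that head's share of the attribution (the page does not define the weights, so this reading is an assumption). The helper name `rank_heads` is hypothetical; the values are copied verbatim from the list.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.03, 1: 0.05, 2: 0.12, 3: 0.03, 4: 0.02, 5: 0.05,
    6: 0.08, 7: 0.04, 8: 0.25, 9: 0.14, 10: 0.08, 11: 0.07,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted from largest to smallest weight."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_weights)
top3 = ranked[:3]                    # the three heads with the largest weights
total = sum(head_weights.values())   # the displayed values are rounded

print(top3)
print(round(total, 2))
```

Under this reading, heads 8, 9, and 2 carry the largest weights, and the listed values sum to 0.96 rather than 1.00, which is consistent with per-entry rounding.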
Negative Logits
76561:-1.21
cffffcc:-1.12
Connector:-1.09
Holy:-1.08
Oops:-0.99
inus:-0.98
Was:-0.97
Didn:-0.97
quart:-0.96
wore:-0.95
Positive Logits
tomorrow:1.51
hereafter:1.47
morrow:1.35
someday:1.30
ezvous:1.22
continue:1.18
continue:1.16
vale:1.16
ugi:1.15
ample:1.11
Activations Density 0.407%