INDEX
Explanations
phrases indicating transformation or becoming significant over time
Head Attr Weights
0:0.01
1:0.01
2:0.08
3:0.13
4:0.16
5:0.02
6:0.11
7:0.14
8:0.03
9:0.03
10:0.09
11:0.13
Negative Logits
preferably  -1.34
intervals   -1.33
ovember     -1.27
sample      -1.26
mosqu       -1.26
beforehand  -1.25
aloud       -1.22
samples     -1.22
manually    -1.21
cafes       -1.20
Positive Logits
rier        1.24
CEO         1.21
Tro         1.18
ernel       1.18
ettel       1.18
Claim       1.17
diver       1.17
compl       1.15
erenn       1.14
ivals       1.14
Activations Density 0.024%