INDEX
Explanations
references to intended or indirect communication
Head Attr Weights
0:0.02
1:0.02
2:0.06
3:0.04
4:0.18
5:0.04
6:0.10
7:0.20
8:0.02
9:0.03
10:0.11
11:0.12
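
The per-head weights above are easier to compare once ranked. The following is a minimal sketch in Python, assuming the list is read as "head index: weight"; the names `head_weights` and `top_heads` are hypothetical and not part of the dashboard. The values are copied from the list as displayed, so they are rounded and need not sum to exactly 1.

```python
# Head attribution weights copied from the list above (head index -> weight).
# Values are the rounded figures shown on the page.
head_weights = {
    0: 0.02, 1: 0.02, 2: 0.06, 3: 0.04,
    4: 0.18, 5: 0.04, 6: 0.10, 7: 0.20,
    8: 0.02, 9: 0.03, 10: 0.11, 11: 0.12,
}


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weights, highest first."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Sum of the listed (rounded) weights, used to express each head as a share.
total = sum(head_weights.values())

for head, weight in top_heads(head_weights):
    print(f"head {head}: {weight:.2f} ({weight / total:.0%} of listed total)")
```

Ranking this way makes it plain that heads 7 and 4 carry the largest listed weights, followed by heads 11, 10 and 6.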
Negative Logits
resil       -1.35
itol        -1.25
ongevity    -1.22
PG          -1.17
adolesc     -1.15
EEK         -1.14
wagon       -1.14
alloween    -1.13
Rebell      -1.13
mosqu       -1.13
Positive Logits
dict         1.35
Dictionary   1.28
nonexistent  1.27
wrongly      1.25
"…           1.25
accus        1.24
emails       1.24
"...         1.21
articles     1.20
inaccurate   1.20
Activations Density 0.013%