INDEX
Explanations
special characters and non-standard symbols
Head Attr Weights
0:0.06
1:0.01
2:0.12
3:0.07
4:0.05
5:0.02
6:0.24
7:0.15
8:0.04
9:0.04
10:0.07
11:0.07
Negative Logits
undown:-1.58
dictators:-1.37
aution:-1.35
Scully:-1.34
distraction:-1.32
uncond:-1.28
confir:-1.23
pestic:-1.23
conclud:-1.22
Disclosure:-1.22
Positive Logits
�:1.70
�:1.57
Eastern:1.50
ua:1.39
ּ:1.38
�醒:1.37
�:1.36
EStream:1.35
�:1.32
�:1.30
Activations Density: 0.002%