INDEX
Explanations
statements expressing skepticism or critical reflections about established beliefs
Head Attr Weights
0:0.04
1:0.03
2:0.11
3:0.07
4:0.17
5:0.04
6:0.06
7:0.19
8:0.06
9:0.04
10:0.05
11:0.08
Negative Logits
aples:-1.63
scope:-1.55
�:-1.53
indefinitely:-1.51
except:-1.50
sov:-1.49
Pry:-1.48
anyways:-1.48
worldly:-1.47
Lavrov:-1.47
Positive Logits
�:1.57
vigilant:1.48
GoldMagikarp:1.47
ldom:1.46
Flavoring:1.46
disob:1.45
okemon:1.39
�:1.33
�:1.32
Breed:1.32
Activations Density 0.000%