Explanations
expressions of personal reaction and ownership
Negative Logits
Token            Logit
ournals          -0.71
idelines         -0.68
Printed          -0.66
soDeliveryDate   -0.65
pub              -0.65
etary            -0.61
onto             -0.61
Dialogue         -0.61
artifacts        -0.60
chairs           -0.59
Positive Logits
Token            Logit
dismay           1.27
surprise         1.19
extent           1.14
aston            1.13
amaz             1.09
detriment        1.07
annoyance        1.02
delight          1.01
disappointment   0.95
puzz             0.94
Activations Density: 0.052%