INDEX
Explanations
phrases of praise or criticism
phrases of high praise or critical assessment
Negative Logits
externalActionCode    -0.85
isen                  -0.68
Repeat                -0.62
tails                 -0.62
Exxon                 -0.61
iland                 -0.59
interrupted           -0.59
rief                  -0.58
OTUS                  -0.58
lyak                  -0.57
Positive Logits
virtues               0.81
unbeat                0.75
uilt                  0.74
defic                 0.70
gobl                  0.68
greatness             0.67
evils                 0.66
toughness             0.63
superiority           0.62
cumbers               0.61
Activations Density: 0.525%