Explanations
statements indicating approval or positive evaluation of various ideas or situations
Negative Logits
eds       -0.88
doms      -0.87
lees      -0.80
ancies    -0.79
stals     -0.78
events    -0.78
gemony    -0.76
RAW       -0.76
onduct    -0.74
isms      -0.74
Positive Logits
idea            1.19
example         1.18
indicator       1.16
asset           1.15
reminder        1.15
thing           1.14
indication      1.14
opportunity     1.11
predictor       1.10
approximation   1.10
Activations Density: 0.081%