INDEX
Explanations
phrases related to potential disaster or chaos
metaphorical expressions related to decline or failure
Negative Logits
Token      Logit
[+         -0.74
TF         -0.62
SF         -0.58
OTOS       -0.58
Uz         -0.57
eg         -0.56
pelled     -0.56
Supp       -0.54
Quality    -0.54
NZ         -0.54
Positive Logits
Token      Logit
ometer     0.82
wagon      0.81
wark       0.72
ousel      0.71
bowl       0.70
coaster    0.69
otomy      0.69
pit        0.68
pins       0.68
vine       0.67
Activations Density: 0.390%