INDEX
Explanations
phrases indicating a high level of importance or value to something
phrases indicating something significant is at risk or in jeopardy
Negative Logits
nces          -0.76
selves        -0.71
marine        -0.67
ments         -0.64
comed         -0.62
cert          -0.62
period        -0.62
ĺ             -0.61
conditioned   -0.60
bite          -0.59
Positive Logits
stake         1.43
onement       1.08
hand          1.00
least         0.93
heart         0.86
risk          0.84
yp            0.83
odds          0.82
abase         0.81
play          0.79
Activations Density: 0.071%