Explanations
phrases related to prevention
phrases related to preventing negative outcomes or situations
Negative Logits
Token        Logit
ammy         -0.92
geist        -0.76
enegger      -0.72
elt          -0.69
stand        -0.68
etics        -0.67
edded        -0.66
eah          -0.66
bold         -0.65
night        -0.63
Positive Logits
Token        Logit
ative        0.95
detection    0.80
regress      0.80
ively        0.77
pregnancies  0.77
duplicate    0.75
accidental   0.74
bothering    0.74
pregnancy    0.72
obstruct     0.72
Activations Density: 0.043%