INDEX
Explanations
phrases emphasizing commitment
instances of the word "commitment"
Negative Logits
adish      -0.73
dos        -0.67
avis       -0.65
Caucas     -0.65
phe        -0.65
isen       -0.64
bin        -0.64
rors       -0.62
walking    -0.61
ox         -0.61
Positive Logits
commitment     1.14
commitments    0.94
Commit         0.86
allegiance     0.83
irmation       0.82
ilitary        0.76
pledge         0.74
gence          0.74
commit         0.73
xual           0.71
Activations Density: 0.014%