INDEX
Explanations
expressions of dedication or commitment
phrases expressing commitment to specific causes or values
Negative Logits
adish     -0.88
ENA       -0.68
AMA       -0.64
Cheong    -0.62
oa        -0.60
ramid     -0.60
oshop     -0.59
ahi       -0.59
uman      -0.59
ena       -0.59
Positive Logits
onite     0.78
unic      0.78
expr      0.78
adherent  0.72
targ      0.71
strong    0.71
duty      0.70
LY        0.69
enough    0.69
lly       0.69
Activations Density: 0.059%