Explanations
references to dignity or dignified behavior
references to dignity and respectful treatment
Negative Logits
Kindle      -0.74
Wonderland  -0.70
Archangel   -0.68
Reloaded    -0.66
thinly      -0.66
覚醒        -0.65
Heard       -0.65
Angry       -0.63
Spear       -0.63
Sapp        -0.62
Positive Logits
dign        1.46
itary       1.32
ified       1.18
itas        1.05
Dign        0.93
uci         0.92
uitous      0.90
citiz       0.87
idad        0.87
honoured    0.86
Activations Density 0.002%