Explanations
phrases related to strong emotional or moral convictions
concepts related to commitment and desire for change
Negative Logits
Token            Logit
tics             -0.82
itudes           -0.77
establishments   -0.72
results          -0.69
aunts            -0.69
passages         -0.69
drops            -0.68
features         -0.68
gestures         -0.67
assets           -0.67
Positive Logits
Token            Logit
atical           1.02
less             0.85
akin             0.84
standpoint       0.79
acious           0.77
whereby          0.75
cipled           0.74
analogous        0.73
ful              0.73
reminiscent      0.72
Activations Density 0.377%