Explanations
sentences that emphasize importance or consequences
phrases and concepts associated with personal responsibility and collective implications
Negative Logits
cms         -0.72
osterone    -0.71
etheus      -0.64
\/          -0.64
cricket     -0.62
acan        -0.60
Leviathan   -0.59
Lah         -0.58
Cheong      -0.57
iage        -0.57
Positive Logits
qualified     0.72
unus          0.69
goodbye       0.67
unprotected   0.67
ezvous        0.66
tu            0.65
bye           0.64
inher         0.59
toast         0.58
vana          0.57
Activations Density: 0.206%