INDEX
Explanations
feeling intense emotions or obligation
Negative Logits
przeciw    0.28
Trying     0.27
Fakt       0.26
Ав         0.25
Being      0.25
ሎ          0.25
ulosa      0.25
Damages    0.24
Pe         0.24
izers      0.24
Positive Logits
intimidated    0.48
obliged        0.44
compelled      0.44
guilty         0.44
outnumbered    0.44
tempted        0.44
drawn          0.42
ridiculous     0.41
obligated      0.41
threatened     0.40
Activations Density 0.020%