INDEX
Explanations
instances of agreement and refusal in interpersonal interactions
Negative Logits
é»          -0.15
opoulos     -0.15
oui         -0.15
uien        -0.15
abar        -0.14
éri         -0.14
Incident    -0.14
ingroup     -0.14
_soft       -0.14
Bolton      -0.13
Positive Logits
asca        0.18
reply       0.17
rames       0.16
rott        0.15
replied     0.15
piler       0.15
.fn         0.15
suit        0.15
Hdr         0.14
ormsg       0.14
Activations Density 0.121%