INDEX
Explanations
instances of interruptions and conversational dynamics in dialogue
Negative Logits
atr        -0.16
èĢĥ        -0.15
shaw       -0.15
oms        -0.15
oto        -0.14
led        -0.14
ibold      -0.14
orex       -0.14
llum       -0.14
missions   -0.13
Positive Logits
INET       0.16
inet       0.14
ertype     0.14
eker       0.14
hec        0.14
piler      0.14
Kemp       0.14
ãĥĥãĥĦ     0.14
anford     0.14
VRT        0.13
Activations Density 0.311%
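
Two of the token strings listed under the logit headings (èĢĥ and ãĥĥãĥĦ) look garbled. They are consistent with how a GPT-2 style byte-level BPE vocabulary displays raw UTF-8 bytes. The sketch below assumes that encoding, which this page does not confirm, and the helper names `byte_to_unicode_table` and `decode_token` are hypothetical. It rebuilds the byte-to-character table used by GPT-2 style tokenizers and inverts it to recover the underlying text.

```python
def byte_to_unicode_table():
    # Assumption: GPT-2 style byte-level BPE. Printable Latin-1 bytes map to
    # themselves; every other byte is shifted to code point 256 + n.
    keep = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in keep}
    n = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + n)
            n += 1
    return table


# Inverse map: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in byte_to_unicode_table().items()}


def decode_token(token: str) -> str:
    # Map each displayed character back to its byte, then decode as UTF-8.
    # Incomplete byte sequences are replaced rather than raising an error.
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["èĢĥ", "ãĥĥãĥĦ", "atr"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption, èĢĥ corresponds to the bytes E8 80 83 (the character 考) and ãĥĥãĥĦ to E3 83 83 E3 83 84 (ッツ), while plain ASCII tokens such as atr decode to themselves. If the model uses a different tokenizer, the mapping does not apply.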