Explanations
expressions related to personal interactions and actions
elements related to interactions and responses in conversations
NEGATIVE LOGITS
xtap            -0.69
etheless        -0.65
iosyncr         -0.58
notably         -0.58
rupal           -0.57
20439           -0.56
eatures         -0.55
corrective      -0.55
ancies          -0.55
knowledgeable   -0.54
POSITIVE LOGITS
!"      1.54
!".     1.39
!'      1.36
!",     1.35
!'"     1.33
!!"     1.31
…"      1.29
?!"     1.28
?"      1.26
…"      1.22
Activations Density 0.644%