INDEX
Explanations
phrases related to conversations or exchanges
conversational cues and expressions of need for communication
NEGATIVE LOGITS
WATCHED       -0.81
"#            -0.81
ĺħ            -0.74
utterstock    -0.72
Article       -0.72
xtap          -0.70
NAS           -0.69
Zucker        -0.69
NFL           -0.68
����          -0.67
POSITIVE LOGITS
-"            2.00
â̦"           1.68
..."          1.64
—"            1.59
!?"           1.35
?"            1.29
?!"           1.26
â̦."          1.24
â̦"           1.22
!"            1.13
Activations Density 0.351%