INDEX
Explanations
phrases related to starting or engaging in a conversation
mentions of conversations
Negative Logits
rule      -0.72
cheat     -0.70
arah      -0.65
rikes     -0.65
anmar     -0.64
emale     -0.64
metics    -0.62
atever    -0.62
redit     -0.61
peria     -0.60
Positive Logits
conversation     0.93
conversations    0.87
ogue             0.86
banter           0.83
naire            0.80
Conversation     0.77
BACK             0.77
overheard        0.76
invitations      0.75
transcripts      0.71
Activations Density 0.022%