Explanations
responses and replies in a conversational context
instances of the word "replied" indicating responses or statements made by individuals
Negative Logits
teenth      -0.71
BALL        -0.69
enburg      -0.67
dar         -0.67
fi          -0.65
bound       -0.65
Blazers     -0.65
icipated    -0.64
mental      -0.64
Trials      -0.63
Positive Logits
thereto             0.99
affirm              0.96
sarcast             0.96
reply               0.94
angrily             0.91
favorably           0.86
later               0.83
enthusiastically    0.83
politely            0.82
promptly            0.80
Activations Density 0.038%