Explanations
questions or requests directed towards someone
instances of the word "asked."
Negative Logits
Token       Logit
Ĥ¬          -0.83
Scouting    -0.74
marine      -0.73
âĸ¬         -0.67
âĶĢâĶĢ      -0.65
cyclop      -0.60
\\\\        -0.60
smoking     -0.60
mit         -0.60
ordinate    -0.60
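Several tokens in the list above (Ĥ¬, âĸ¬, âĶĢâĶĢ) look like raw byte-level BPE vocabulary entries rather than encoding damage. The sketch below assumes the tokens come from a GPT-2-style byte-level vocabulary, which the page does not state; `decode_token` is a hypothetical helper name, and the byte table reproduces the mapping used by GPT-2's tokenizer.

```python
def bytes_to_unicode():
    """Build the GPT-2 style byte -> printable-character table."""
    # Bytes that are already printable map to themselves.
    bs = (list(range(0x21, 0x7F))      # '!' .. '~'
          + list(range(0xA1, 0xAD))    # '¡' .. '¬'
          + list(range(0xAE, 0x100)))  # '®' .. 'ÿ'
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at or above 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: vocabulary character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a vocabulary string back into text.

    Assumes a GPT-2-style byte-level vocabulary. Incomplete UTF-8
    sequences are shown as U+FFFD replacement characters.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


box = decode_token("âĶĢâĶĢ")    # two box-drawing characters
bar = decode_token("âĸ¬")       # a single block character
partial = decode_token("Ĥ¬")    # trailing bytes only, not valid UTF-8 alone
plain = decode_token("smoking")  # ASCII tokens pass through unchanged
```

Under that assumption, âĶĢâĶĢ and âĸ¬ decode to box-drawing and block glyphs, while Ĥ¬ holds only continuation bytes of a longer character and cannot be rendered on its own.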
Positive Logits
Token         Logit
rhet          1.18
questions     1.03
probing       0.98
politely      0.89
him           0.87
forgiveness   0.85
asked         0.85
naires        0.84
ioned         0.82
permission    0.81
Activations Density: 0.044%
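As a rough illustration of what a density figure like this measures, the sketch below assumes it is the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions for illustration and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: the page does not say how density is computed.
    """
    total = len(activations)
    if total == 0:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / total


# Synthetic data: 100,000 tokens, of which 44 have a nonzero activation.
sample = [0.0] * 99_956 + [1.0] * 44
density = activation_density(sample)
print(f"{density:.3%}")
```

With this definition, a density of 0.044% would correspond to the feature firing on roughly 44 of every 100,000 tokens.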