INDEX
Explanations
instances of the word "ask" and its variations
Negative Logits
Token           Logit
luaj            -0.73
EStreamFrame    -0.66
lishes          -0.66
Devices         -0.66
audi            -0.65
swing           -0.65
Cheong          -0.64
adj             -0.63
capital         -0.62
gins            -0.60
Positive Logits
Token           Logit
politely        1.06
permission      0.95
rhet            0.88
questions       0.87
naires          0.87
forgiveness     0.84
him             0.84
nicely          0.82
probing         0.82
answered        0.78
Activations Density: 0.035%
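For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of sampled tokens on which the feature activates above zero; the page itself does not state its definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 7 of 20,000 tokens.
acts = [0.0] * 19_993 + [1.2] * 7
print(f"{activation_density(acts):.3%}")  # prints 0.035%
```

A density this low would mean the feature fires on roughly one token in three thousand, which fits a feature tied to a single word family such as "ask".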