INDEX
Explanations
instances of textual requests or prompts for responses
occurrences of the word "request" and its variations
Negative Logits
Token        Logit
lasses       -0.75
nat          -0.70
Surv         -0.68
icals        -0.66
olson        -0.65
Blacks       -0.65
Sax          -0.64
stocks       -0.64
tal          -0.63
Adventures   -0.62
Positive Logits
Token        Logit
requests     1.03
permission   0.96
request      0.94
request      0.88
requested    0.87
warrants     0.84
requesting   0.81
Animation    0.79
pardon       0.77
granted      0.76
Activations Density 0.029%