Explanations
references to questions and requests for information, particularly in formal contexts
Text before "please"
requests with please
Negative Logits
Token       Logit
mine        -0.54
+][         -0.51
IDK         -0.49
lains       -0.48
ηγ          -0.48
getSource   -0.48
mine        -0.47
ないけど    -0.46
mia         -0.45
gotta       -0.45
Positive Logits
Token       Logit
please      2.21
Please      2.03
Please      1.96
please      1.92
PLEASE      1.76
PLEASE      1.58
pls         1.39
bitte       1.39
veuillez    1.36
Kindly      1.33
Activations Density: 0.236%