Explanations
pronoun after dialogue punctuation
Negative Logits
focuses    0.53
:"         0.50
".         0.50
->         0.50
:"         0.50
".         0.50
'$         0.49
...")      0.48
retains    0.48
--->       0.48
Positive Logits
he         0.75
他說       0.75
他说       0.74
she        0.63
murmured   0.63
she        0.56
он         0.56
她说       0.54
katanya    0.53
came       0.49
Activations Density 0.009%