INDEX
Explanations
auxiliary verbs followed by verbs
Negative Logits
yourself    0.96
yourself    0.83
Yourself    0.82
نفسك    0.70
jezelf    0.69
hebt    0.63
あなたは    0.60
رکھتا    0.59
sozinho    0.59
நீங்கள்    0.58
Positive Logits
themselves    1.71
flock    1.06
mselves    0.93
纷纷    0.91
தங்கள்    0.91
leurs    0.90
ойношот    0.86
flocked    0.85
తమ    0.83
ihre    0.82
Activations Density 0.047%