Explanations
The pronoun "his" in multiple languages
Negative Logits
yourselves      0.51
configuration   0.49
if              0.49
suboptimal      0.49
we              0.48
easiest         0.47
nếu             0.47
bitmaps         0.47
guesswork       0.47
vapors          0.47
Positive Logits
hänen           0.71
jego            0.70
njegov          0.70
njegove         0.70
佢              0.70
jeho            0.68
kanyang         0.67
njeg            0.66
తన              0.66
его             0.65
Activations Density 0.031%