Explanations
end-of-sentence punctuation followed by a pronoun
NEGATIVE LOGITS
↵↵         0.52
other      0.48
ad         0.43
}          0.43
ic         0.42
ed         0.41
res        0.41
int        0.41
и          0.38
class      0.38
POSITIVE LOGITS
Tämä       0.73
이건       0.65
Acest      0.65
이는       0.60
Đây        0.60
これは     0.60
وهذا       0.59
Özellikle  0.59
Diese      0.59
Ensures    0.58
Activations Density: 5.083%