Explanations
pronoun followed by past tense verb
Negative Logits
만          0.36
จน         0.34
م          0.34
DETAILS    0.32
BAB        0.32
obscures   0.30
If         0.30
いつ        0.30
输入        0.30
X          0.30
Positive Logits
hatte          0.42
had            0.41
aveva          0.40
thanked        0.37
saddened       0.36
reluctantly    0.36
apologized     0.35
hadde          0.35
hesitated      0.35
congratulated  0.34
Activations Density 0.112%