INDEX
Explanations
No Explanations Found
Negative Logits
ook    1.31
ात    1.23
ures    1.22
end    1.20
వా    1.18
م    1.15
реза    1.13
talks    1.12
es    1.10
త    1.09
Positive Logits
светло    1.28
𝑄    1.24
feminine    1.21
𝑇    1.21
bewerken    1.21
masculine    1.20
CORRECT    1.18
Gauche    1.16
בּ    1.16
픔    1.16
Activations Density 0.000%
No Known Activations
This feature has no known activations.