Explanations
possession and following nouns
Negative Logits
wa    0.37
ika    0.35
بين    0.35
l    0.34
보    0.33
ida    0.33
entre    0.33
ides    0.33
ists    0.32
sa    0.32
Positive Logits
ی    0.32
ာ    0.30
า    0.30
ി    0.30
RUPTION    0.30
י    0.29
contexte    0.28
twierd    0.28
Diversity    0.28
ੇ    0.28
Activations Density: 2.932%