INDEX
Explanations
possessive nouns and associated concepts
Negative Logits
Token    Value
ת    1.05
មនុស្ស    1.03
ં    1.02
ંને    1.02
guys    0.98
അതിന്റെ    0.97
ީ    0.96
its    0.95
मर्दों    0.93
മനുഷ്യ    0.92
Positive Logits
Token    Value
who    1.05
rights    1.05
którzy    1.04
kteří    0.95
dreams    0.90
sake    0.87
deres    0.87
shoes    0.86
quem    0.85
precisam    0.84
Activations Density: 0.061%