Explanations
following articles, nouns, or prepositions
Negative Logits
Что	0.86
ሖ	0.83
importanza	0.79
percayaan	0.75
อะไร	0.72
важней	0.72
ipotent	0.71
барои	0.70
ფუნქ	0.70
植物	0.69

Positive Logits
also	1.34
también	1.27
Also	1.23
also	1.18
também	1.15
également	1.12
other	1.05
diğer	0.99
anche	0.98
two	0.96
Activations Density 0.004%