INDEX
Explanations
the significance of abstract nouns
Negative Logits
Token         Value
m             1.21
.             1.03
apaixon       0.90
others        0.88
dimers        0.88
cufflinks     0.87
spouse        0.82
haupt         0.81
ING           0.80
sauerkraut    0.80
Positive Logits
Token         Value
ación         1.06
zione         1.02
丆            1.02
кількість     1.01
ອ             1.00
ínio          0.98
quantidade    0.97
ittäin        0.96
èle           0.96
icación       0.96
Activations Density 0.424%