Explanations
expressions of desire or preference toward actions or objects
desire or liking
Negative Logits
-0.34
poffible    -0.30
cris    -0.30
power    -0.29
ьажоргаш    -0.29
ientí    -0.29
wei    -0.28
Karo    -0.28
instin    -0.28
false    -0.28
Positive Logits
surla    0.79
gostaria    0.69
gustaría    0.65
aimerais    0.57
AddWithValue    0.56
quisiera    0.50
expandindo    0.49
jalá    0.49
EconPapers    0.49
وفاته    0.48
Activations Density: 0.055%
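To make the density figure concrete, here is a minimal sketch of how such a value could be computed. It assumes that "activations density" means the fraction of sampled token positions on which the feature's activation is above zero; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 11 of 20,000 sampled tokens.
sample = [0.0] * 19_989 + [1.5] * 11
density = activation_density(sample)

# Format as a percentage with three decimal places.
print(f"{density:.3%}")
```

Under this reading, a density of 0.055% means the feature is active on roughly 1 token in every 1,800.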