INDEX
Explanations
acknowledgments and expressions of gratitude
Negative Logits
יוחד  -0.49
reds  -0.47
tso  -0.45
ENARIO  -0.45
付け  -0.44
dymyr  -0.44
ÍST  -0.44
arthed  -0.43
ensed  -0.42
ใคร  -0.42
Positive Logits
being  1.25
being  1.10
having  1.06
having  1.02
étant  1.01
Being  0.96
haberse  0.92
habiendo  0.90
esser  0.89
BEING  0.88
Activations Density: 0.174%