INDEX
Explanations
questions about processes and experiences
Negative Logits
ContentAlignment  -0.47
kysy              -0.42
sério             -0.41
dezelve           -0.41
brancas           -0.41
econômica         -0.40
mitään            -0.39
سكانية            -0.39
kaupung           -0.39
kirja             -0.38
Positive Logits
How               0.87
How               0.85
how               0.84
how               0.79
Cómo              0.79
cómo              0.77
HOW               0.75
howto             0.75
如何              0.71
HOW               0.69
Activations Density: 0.242%