Explanations
specific Polish words and phrases related to personal experiences and emotions
Negative Logits
kus  -0.19
kil  -0.18
kok  -0.16
okable  -0.15
buch  -0.15
衣  -0.15
oč  -0.14
νού  -0.14
Ī  -0.14
セン  -0.14
Positive Logits
ж  0.55
ž  0.53
ż  0.52
жа  0.43
ży  0.42
жи  0.41
Ж  0.41
же  0.41
ži  0.41
ží  0.41
Activations Density 0.024%