Explanations
expressing likes and dislikes
Negative Logits
peut    1.32
deu    1.25
pesawat    1.23
pertenece    1.16
ğı    1.16
alcanza    1.12
intim    1.12
информации    1.12
resp    1.12
για    1.12
Positive Logits
y    1.31
ي    1.30
څنګه    1.22
redditmedia    1.22
minded    1.19
𝘳    1.18
मंडल    1.16
viel    1.13
چڑھ    1.12
ۥ    1.11
Activations Density: 0.424%