INDEX
Explanations
difficult emotions or subjects
Negative Logits
ότι             1.30
것              1.27
ב               1.25
difficulties    1.16
ਇੱਕ             1.15
कर              1.10
городах         1.10
𝑑               1.10
воды            1.09
드              1.09
Positive Logits
t               1.49
is              1.26
_               1.14
-               1.10
ol              1.01
ן               1.00
us              0.94
ist             0.94
$               0.94
in              0.91
Activations Density: 0.013%