Negative Logits
Token       Logit
“[          -1.14
            -1.05
[]          -1.00
'"+         -1.00
ְּ          -0.96
'-')        -0.96
befindet    -0.95
.$          -0.94
“[          -0.94
,@          -0.93
Positive Logits
Token       Logit
():         3.13
():         2.88
():         2.05
(           2.03
):          1.97
'):         1.94
):          1.91
()):        1.88
):          1.75
(           1.71
Activations Density 0.019%