INDEX
Explanations
references to death and its associated concepts
Negative Logits
ٌ (Arabic dammatan, U+064C)   -0.27
تان        -0.20
альная     -0.18
ская       -0.18
deutsche   -0.18
шая        -0.17
ichtet     -0.17
ischer     -0.17
loses      -0.17
ُ (Arabic damma, U+064F)   -0.17
Positive Logits
ischen     0.42
lichen     0.33
genden     0.31
kleinen    0.29
enden      0.27
igen       0.27
uellen     0.26
utschen    0.25
großen     0.24
ierten     0.24
Activations Density 0.049%
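
To give a sense of scale for the density figure, the short sketch below converts it into an expected token count. It assumes that "density" means the fraction of tokens on which the feature has a nonzero activation; the dashboard does not define the term, so this reading and the helper name `expected_activations` are illustrative assumptions, not part of the source.

```python
def expected_activations(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens with a nonzero activation.

    Assumption: density is the fraction of tokens on which the feature
    fires, expressed as a percentage.
    """
    return density_percent / 100.0 * n_tokens


# With a density of 0.049%, roughly 490 of every 1,000,000 tokens activate.
count = expected_activations(0.049, 1_000_000)
print(round(count))  # 490
```

Under that assumption the feature is sparse: it fires on roughly one token in two thousand.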