INDEX
Explanations
instances of specific names and references related to cultural or historical subjects
Negative Logits
erotiske     -0.16
Ñij          -0.16
era          -0.16
lings        -0.15
Ãł           -0.15
äng          -0.15
angen        -0.15
soever       -0.15
geschichten  -0.15
led          -0.15
Positive Logits
sz           0.24
egy          0.24
agy          0.21
Sz           0.21
harm         0.20
gy           0.20
legs         0.20
szer         0.19
ny           0.19
sz           0.18
Activations Density 0.011%