Explanations
terms related to social and political issues, particularly in a Danish context
Negative Logits
rete      -0.17
iting     -0.17
hek       -0.15
Ost       -0.15
ilt       -0.14
irit      -0.14
enberg    -0.14
amp       -0.14
äge       -0.14
ammu      -0.14
Positive Logits
er        0.38
har       0.35
kan       0.24
har       0.24
.er       0.23
HAR       0.22
_er       0.22
ER        0.21
Er        0.21
Har       0.21
Activations Density 0.020%