Explanations
mentions of Korea or related terms
Korea or Korean Peninsula
Negative Logits
Token    Logit
Wim      -0.44
!        -0.43
Cla      -0.40
Bus      -0.40
invo     -0.39
sif      -0.38
cla      -0.38
sal      -0.38
Wim      -0.38
Jill     -0.38
Positive Logits
Token    Logit
Korea    2.14
Korea    2.02
korea    1.74
Korean   1.61
Koreans  1.56
korea    1.51
Korean   1.46
Corée    1.44
OREA     1.37
Corea    1.34
Activations Density: 0.004%