INDEX
Explanations
references to the term "Western" in various contexts
Negative Logits
nel               -0.18
anton             -0.15
éĴ                -0.14
ikon              -0.14
.scalablytyped    -0.14
uk                -0.13
ndx               -0.13
uar               -0.13
atisf             -0.13
306               -0.13
POSITIVE LOGITS
most              0.30
ers               0.27
ized              0.25
ization           0.23
-most             0.23
er                0.23
s                 0.21
esse              0.21
ised              0.20
isation           0.20
Activations Density 0.017%