INDEX
Explanations
references to Martin Luther King Jr. and related terms
Negative Logits
neys      -0.15
ies       -0.15
ël        -0.14
ichten    -0.14
ardi      -0.14
verity    -0.13
Mort      -0.13
wz        -0.13
jsc       -0.13
プレ (byte-level form: ãĥĹãĥ¬)  -0.13
Positive Logits
King      0.34
King      0.28
KING      0.26
king      0.25
Jr        0.23
king      0.23
ML        0.22
キング (byte-level form: ãĤŃãĥ³ãĤ°)  0.22
Luther    0.21
ML        0.18
Activations Density 0.008%
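
Two entries in the logit lists appear in byte-level form (ãĥĹãĥ¬ and ãĤŃãĥ³ãĤ°). The sketch below shows one way to read them, assuming the token strings use the GPT-2-style byte-to-character table that byte-level BPE tokenizers commonly rely on; the page itself does not name the tokenizer, so that is an assumption. The names `bytes_to_unicode`, `CHAR_TO_BYTE`, and `decode_token` are illustrative helpers, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build a byte -> printable-character table in the style of GPT-2's byte-level BPE.

    Assumption: the token strings on this page were rendered with this table.
    """
    # Bytes that already have a printable character keep that character.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    mapping = {b: chr(b) for b in printable}
    # Every remaining byte is assigned the next code point starting at 256.
    offset = 0
    for b in range(256):
        if b not in mapping:
            mapping[b] = chr(256 + offset)
            offset += 1
    return mapping


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map each displayed character back to its byte, then decode the bytes as UTF-8."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĤŃãĥ³ãĤ°"))  # キング
print(decode_token("ãĥĹãĥ¬"))  # プレ
```

Under this assumption, the positive-logit entry decodes to キング, the katakana spelling of "King", which is consistent with the explanation given for this feature.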