Explanations
historical figures and titles
Negative Logits
maestros  0.86
cowboys  0.82
Heidi  0.79
equipos  0.78
percol  0.77
Kollegen  0.77
பெண்  0.76
piccoli  0.76
여성  0.74
topi  0.74
Positive Logits
Ⅶ  0.97
II  0.91
VII  0.89
VII  0.87
henius  0.86
̕  0.86
הראשון  0.85
Saxony  0.85
Ⅵ  0.84
Ⅱ  0.84
Activations Density 0.060%