INDEX
Explanations
references to heroes and heroism in various contexts
Negative Logits
roje      -0.17
serter    -0.15
VEC       -0.15
ãĤ¤ãĤ¯    -0.15
å¢ĵ       -0.15
enor      -0.15
oy        -0.15
enko      -0.14
erman     -0.14
wers      -0.14
Positive Logits
ingles    0.17
he        0.15
ines      0.15
ing       0.15
oval      0.14
inch      0.14
ically    0.14
اÙĨÙĩ     0.14
ำ         0.14
olin      0.13
Activations Density 0.041%