INDEX
Explanations
entities related to leadership and authority figures
Negative Logits
alter    -0.17
ushman   -0.17
geç      -0.15
.nlm     -0.15
enge     -0.15
elter    -0.15
tery     -0.14
ITER     -0.14
Alter    -0.14
nice     -0.14
POSITIVE LOGITS
oli      0.17
æĸĹ      0.15
½        0.15
anium    0.15
Lo       0.15
ritten   0.14
á»ķ      0.14
lein     0.14
olo      0.14
oria     0.14
Activations Density 0.026%