INDEX
Explanations
names of prominent political figures
Negative Logits
Grove    -0.16
anga     -0.15
Blond    -0.14
åde      -0.14
aç       -0.14
ackle    -0.14
IFEST    -0.13
šen      -0.13
dna      -0.13
rove     -0.13
Positive Logits
cts      0.15
blind    0.15
ось      0.14
ALI      0.14
ceph     0.14
extr     0.13
assic    0.13
Tob      0.13
adece    0.13
шли      0.13
Activations Density 0.015%