INDEX
Explanations
mentions of individual people and their interactions
Negative Logits
ipeg        -0.17
éĹ          -0.16
anth        -0.16
æij         -0.16
HIR         -0.15
ssel        -0.15
oref        -0.14
ç¾Ĭ         -0.14
ké          -0.14
çĻ»         -0.14
Positive Logits
ãĥ¡ãĥ¼ãĤ¸   0.16
ARIO        0.15
arios       0.15
vg          0.15
Ont         0.14
riet        0.14
reno        0.14
resign      0.14
éª          0.14
ocale       0.14
Activations Density: 0.121%