Explanations
mentions of specific individuals and their placeholder pages
Negative Logits
ä¸Ī       -0.16
nia       -0.15
-mouth    -0.15
emoc      -0.15
zych      -0.15
rai       -0.14
zzo       -0.14
.tt       -0.14
ream      -0.14
arte      -0.14
Positive Logits
/name     0.16
profile   0.15
class     0.15
uant      0.15
ss        0.15
et        0.14
name      0.14
topp      0.14
Profile   0.14
695       0.14
Activations Density 0.005%