INDEX
Explanations
mentions of celebrity relationships and personal stories
Negative Logits
ーター    -0.15
Verm      -0.14
oram      -0.14
kola      -0.14
gaard     -0.14
чки       -0.14
〜        -0.14
eldon     -0.14
елеф      -0.13
olars     -0.13
Positive Logits
Taylor    0.24
Offset    0.23
Drake     0.22
Ade       0.21
Card      0.20
Offset    0.20
Bey       0.20
Lady      0.20
Ari       0.20
Ci        0.20
Activations Density 0.126%