INDEX
Explanations
connections between sentiment and personal experiences
Negative Logits
urgent    -0.16
ANTE      -0.15
ерк       -0.15
refix     -0.15
ulo       -0.14
wr        -0.14
idel      -0.14
sto       -0.14
wr        -0.14
rá        -0.14
Positive Logits
ours      0.27
hers      0.22
mine      0.22
yours     0.20
mine      0.20
obvykle   0.18
ours      0.18
이번      0.17
501       0.17
Mine      0.17
Activations Density 0.330%