Explanations
dialogue or quotes that indicate someone's opinion or statement
Negative Logits
Ease      -0.15
×↵↵       -0.13
λά        -0.13
kred      -0.13
uro       -0.13
avo       -0.12
теч       -0.12
lie       -0.12
تن        -0.12
ander     -0.12
Positive Logits
one       0.38
said      0.22
一个      0.21
một       0.20
an        0.19
eines     0.19
a         0.19
一個      0.19
longtime  0.19
یکی       0.18
Activations Density 0.039%