INDEX
Explanations
instances of mixed emotional responses or opinions
Negative Logits
odata          -0.15
arian          -0.15
Ïģιά           -0.15
ácil           -0.14
umed           -0.14
udden          -0.14
Demir          -0.14
ÑĢин           -0.13
.study         -0.13
Canter         -0.13
Positive Logits
tures          0.18
dok            0.16
ë§IJ           0.15
stands         0.15
cul            0.15
Lords          0.14
_context       0.14
ep             0.14
ObjectContext  0.14
anye           0.13
Activations Density 0.002%