Explanations
instances of conflict or tension in relationships
Negative Logits
было  -0.19
بودند  -0.18
Were  -0.17
был  -0.17
Were  -0.17
были  -0.17
بود  -0.16
było  -0.16
была  -0.15
peats  -0.15
Positive Logits
ate  0.30
got  0.29
took  0.28
sat  0.27
drank  0.26
threw  0.26
ran  0.25
drove  0.25
got  0.24
drew  0.24
Activations Density 0.232%