INDEX
Explanations
instances of significant physical interactions and emotional responses
Negative Logits
Yourself      -0.44
yourself      -0.42
yourselves    -0.41
himself       -0.39
myself        -0.37
Himself       -0.36
ourselves     -0.36
herself       -0.33
themselves    -0.32
him           -0.28
Positive Logits
seu           0.47
sua           0.42
his           0.42
seus          0.39
suas          0.37
suo           0.35
her           0.34
their         0.32
她的          0.31
seine         0.31
Activations Density 0.119%