INDEX
Explanations
relationships and interactions among people, particularly in social and emotional contexts
Negative Logits
aling         -0.16
818           -0.16
IBE           -0.15
elling        -0.15
\Migrations   -0.15
alar          -0.15
rios          -0.14
iero          -0.14
uje           -0.14
responsible   -0.14
Positive Logits
being         0.73
being         0.57
Being         0.49
having        0.45
Being         0.45
sendo         0.44
becoming      0.40
被            0.38
siendo        0.37
-being        0.33
Activations Density: 0.272%