INDEX
Explanations
references to people and their relationships within a community context
Negative Logits
themselves    -0.18
celed         -0.15
ově           -0.14
τού           -0.14
ersistent     -0.14
of            -0.14
zte           -0.14
enting        -0.14
всех          -0.14
bish          -0.13
Positive Logits
are           0.15
acles         0.15
iei           0.14
usc           0.14
mo            0.14
insky         0.14
-même         0.14
dee           0.14
maal          0.14
772           0.13
Activations Density 0.066%