INDEX
Explanations
references to conditions and relationships involving entities and attributes
Negative Logits
ITS     -0.56
Its     -0.47
ITS     -0.47
Its     -0.47
OWA     -0.45
its     -0.44
холо    -0.42
dl      -0.41
Fils    -0.41
اش      -0.40
Positive Logits
themselves    1.24
themselves    1.24
yourselves    0.92
their         0.91
they          0.89
Their         0.89
their         0.85
Their         0.83
którzy        0.81
彼らの        0.81
Activations Density: 0.746%