INDEX
Explanations
instances where two entities are being compared or contrasted
instances where pairs of subjects or entities are mentioned together
Negative Logits
Trend       -0.73
yip         -0.71
etter       -0.70
uel         -0.67
fecture     -0.65
assembly    -0.65
bor         -0.65
Warden      -0.64
keleton     -0.63
inth        -0.63
Positive Logits
edged            0.93
equally          0.84
ocated           0.81
ocating          0.81
respectively     0.80
sides            0.77
vying            0.75
acknowledged     0.73
denominations    0.72
ドラゴン         0.71
Activations Density: 0.083%
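
Token strings in listings like this are sometimes shown in the tokenizer's byte-level display form, where a multi-byte UTF-8 character appears as a run of Latin-looking characters (for example `ãĥīãĥ©ãĤ´ãĥ³` for ドラゴン). The sketch below shows one way to map such a string back to readable text. It assumes a GPT-2 style byte-level BPE vocabulary, which the listing itself does not state, and the helper names `bytes_to_unicode` and `decode_display_token` are illustrative only.

```python
def bytes_to_unicode():
    """Byte -> display-character table in the style used by GPT-2 (assumed here)."""
    # Printable bytes map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(0xA1, 0xAD))
          + list(range(0xAE, 0x100)))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


def decode_display_token(token):
    """Turn a byte-level display string back into ordinary text."""
    inverse = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(inverse[ch] for ch in token)
    # Partial characters can occur at token boundaries, hence errors="replace".
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_display_token("ãĥīãĥ©ãĤ´ãĥ³"))
```

Under the same assumption, a leading `Ġ` in a display string stands for a space, so `Ġedged` would decode to " edged".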