INDEX
Explanations
pairs of entities or people that are mentioned together in a document
Negative Logits
Token     Logit
only      -0.73
ventory   -0.73
reth      -0.73
spection  -0.71
livion    -0.70
=-=-      -0.70
etter     -0.69
alion     -0.68
dad       -0.68
Pastebin  -0.68
Positive Logits
Token       Logit
equally     0.94
excellent   0.89
vying       0.88
sexes       0.87
extremes    0.84
strengths   0.83
trademarks  0.82
positives   0.81
examples    0.80
ocating     0.78
Activations Density: 0.133%