Explanations
terms or phrases related to equivalence or comparison
phrases relating to equivalence or comparison
Negative Logits
Token    Logit
ger      -0.81
oard     -0.78
der      -0.76
ben      -0.76
stra     -0.71
Bomb     -0.70
stal     -0.69
zan      -0.66
hra      -0.65
ARD      -0.64
Positive Logits
Token          Logit
equivalent     0.97
ivalent        0.94
isons          0.93
lihood         0.90
equival        0.80
awei           0.79
equivalents    0.78
bnb            0.78
oppos          0.73
imately        0.73
Activations Density: 0.010%