INDEX
Explanations
comparisons between different entities
phrases that compare two opposing concepts or entities, often in the form "X vs. Y"
Negative Logits
Token     Logit
shire     -0.77
spot      -0.75
liner     -0.72
oola      -0.69
bean      -0.65
Topics    -0.64
estern    -0.63
Lauder    -0.61
tarian    -0.60
lied      -0.60
Positive Logits
Token     Logit
creen     0.76
illa      0.75
illas     0.74
pecting   0.74
ampa      0.71
.,        0.65
RHP       0.63
pect      0.62
seq       0.61
iors      0.60
Activations Density: 0.017%