INDEX
Explanations
comparisons or similarities between different entities
comparative phrases indicating a level of similarity or difference
Negative Logits
eri        -0.69
catalyst   -0.65
eni        -0.63
auri       -0.59
inho       -0.59
lla        -0.57
gnu        -0.57
onomy      -0.57
)].        -0.56
opped      -0.56
Positive Logits
anything   0.74
any        0.73
elsewhere  0.70
anywhere   0.70
anybody    0.69
vice       0.66
conn       0.65
those      0.63
insofar    0.62
anyone     0.62
Activations Density: 0.139%