Explanations
phrases related to comparisons or contrasts between different entities
Negative Logits
upon      -0.80
ocate     -0.69
ritch     -0.68
atron     -0.65
ties      -0.64
ibur      -0.63
lessly    -0.61
̶         -0.61
aurus     -0.60
lance     -0.60
Positive Logits
occasions  1.51
basis      1.40
occasion   1.37
behalf     1.33
doorstep   1.26
grounds    1.18
verge      1.15
eve        1.15
sidelines  1.14
heels      1.10
Activations Density 1.730%
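
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions where the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the toy activations are all hypothetical and are not taken from the dashboard's data:

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not state its definition.
    """
    if len(activations) == 0:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (made up): 173 active positions out of 10,000 sampled tokens.
toy_activations = [0.0] * 9827 + [0.5] * 173
density = activation_density(toy_activations)
print(f"Activation density: {density:.3%}")
```

Under this reading, a density near 1.7% would mean the feature is silent on the large majority of tokens and fires on roughly one token in sixty.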