Explanations
words related to conflict or opposition between different entities
occurrences of the word "and."
Negative Logits
Discuss      -0.84
oren         -0.73
Reviewed     -0.72
((           -0.72
oops         -0.71
ï¸           -0.71
................................................................  -0.71
notations    -0.71
huge         -0.70
\.           -0.70
Positive Logits
ours         0.94
its          0.87
hers         0.87
theirs       0.87
the          0.78
actual       0.78
those        0.77
yours        0.75
their        0.68
non          0.67
Activations Density 0.141%