INDEX
Explanations
comparisons or contrasts between different concepts or entities
comparisons and contrasts between opposing concepts
Negative Logits
oker        -0.73
iband       -0.70
cffffcc     -0.69
eri         -0.67
istar       -0.65
atern       -0.65
aspers      -0.64
aston       -0.64
akening     -0.63
ortal       -0.63
Positive Logits
hers        0.84
theirs      0.81
nurture     0.77
ours        0.72
damned      0.65
equals      0.64
adversity   0.64
reality     0.63
gew         0.62
dr          0.62
Activations Density: 0.196%