INDEX
Explanations
words related to opposition or conflict
Negative Logits
ipel        -0.69
uilt        -0.67
enegger     -0.67
ibly        -0.62
ITED        -0.61
IOR         -0.61
insured     -0.61
ollen       -0.59
iven        -0.58
affected    -0.58
Positive Logits
heels       0.88
glove       0.74
toe         0.69
pound       0.68
knees       0.67
tails       0.67
bisc        0.66
quo         0.66
roses       0.65
oranges     0.64
Activations Density 6.529%