Explanations
phrases related to contrasts or opposition
phrases related to contrasting or conflicting ideas
Negative Logits
exit        -0.65
flirt       -0.64
Codec       -0.64
resort      -0.63
Burnett     -0.63
Dickinson   -0.62
indemn      -0.62
troop       -0.60
desp        -0.60
emoji       -0.60
Positive Logits
sized       1.21
based       1.20
turned      1.14
style       1.13
sama        1.09
inspired    1.06
themed      1.06
driven      1.05
type        1.05
worthy      1.05
Activations Density 0.175%