INDEX
Explanations
phrases emphasizing contrast or exception
phrases that emphasize the concept of negation and contrast
Negative Logits
nect        -0.60
Grimoire    -0.60
Oath        -0.56
Pant        -0.55
Phill       -0.55
Dip         -0.54
timer       -0.53
Lover       -0.52
borough     -0.51
FINE        -0.51
Positive Logits
only            1.48
merely          1.20
just            1.18
icably          1.17
just            1.13
only            1.07
JUST            1.06
ONLY            1.03
withstanding    0.96
ches            0.95
Activations Density 0.102%
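
As a rough illustration of what a density figure like this usually measures, the sketch below computes the share of token positions on which a feature's activation is nonzero. The function name `activation_density`, the `threshold` parameter, and the sample activations are hypothetical. The page does not state how its 0.102% was computed, so the definition used here is an assumption.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled token positions on
    which the feature fires at all. This is not confirmed by the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 of 1000 token positions.
sample = [0.0] * 999 + [2.5]
print(f"{activation_density(sample):.3%}")
```

Under this definition, a density near 0.1% means the feature fires on roughly one token position in a thousand.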