INDEX
Explanations
phrases related to comparison or contrast
words indicating importance or significance
Negative Logits
Token            Logit
but              -0.66
orie             -0.65
WT               -0.63
soDeliveryDate   -0.62
ructose          -0.61
schild           -0.61
iddle            -0.61
Travels          -0.60
ukong            -0.60
But              -0.59
Positive Logits
Token            Logit
nonetheless      1.46
nevertheless     1.18
etheless         1.07
still            0.77
darn             0.73
anyway           0.71
awfully          0.69
strangely        0.68
overshadowed     0.67
anyways          0.66
Activations Density: 1.312%