INDEX
Explanations
instances of contrast or contrasting ideas
the word "while" and its variations to indicate contrasting scenarios or conditions
Negative Logits
ilet      -0.73
Lay       -0.66
orthy     -0.65
bard      -0.64
atari     -0.64
INS       -0.63
aeda      -0.63
inated    -0.63
idy       -0.62
ANN       -0.62
Positive Logits
acknowledging   0.94
technically     0.80
respecting      0.80
conced          0.77
researching     0.74
admitting       0.74
imperfect       0.72
terness         0.67
shading         0.67
lihood          0.66
Activations Density 0.070%