INDEX
Explanations
phrases or sentences emphasizing contrast or contradiction
phrases that express skepticism or doubt
Negative Logits
hang                 -0.84
ais                  -0.72
roxy                 -0.69
yne                  -0.68
eki                  -0.67
iem                  -0.67
ItemThumbnailImage   -0.66
tein                 -0.65
spin                 -0.65
MAT                  -0.64
Positive Logits
ever                  1.02
bother                0.83
EVER                  0.82
percept               0.80
conceivable           0.80
imaginable            0.79
noticeable            0.74
distinguish           0.74
believable            0.72
bothering             0.72
Activations Density: 0.047%