Explanations
phrases indicating a lack of significant change or importance
the repetition of the phrase "not much"
Negative Logits
yne        -0.75
kus        -0.73
idon       -0.73
İĭ         -0.73
YES        -0.70
iseum      -0.67
izoph      -0.67
avering    -0.66
èĢ         -0.66
ortium     -0.66
Positive Logits
anymore       1.10
consolation   0.87
else          0.84
avail         0.78
fuss          0.76
bothered      0.74
nor           0.73
whatsoever    0.73
dime          0.70
noticeable    0.69
Activations Density: 0.057%
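
The density figure above is given without a definition. A minimal sketch, assuming it means the fraction of sampled tokens on which the feature has a nonzero activation; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is read here as (active tokens) / (all tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 6 of which activate the feature.
sample = [0.0] * 9994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the figure above
```

Under this reading, a density of 0.057% would correspond to the feature firing on roughly 57 of every 100,000 tokens.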