INDEX
Explanations
phrases indicating a level of difficulty or timeliness
instances of the phrase "not too" indicating a qualitative assessment
Negative Logits
igi        -0.77
swick      -0.76
hyde       -0.72
ERT        -0.69
ngth       -0.68
icipated   -0.67
intosh     -0.65
çļ         -0.65
Rus        -0.64
ridor      -0.63
Positive Logits
much          0.91
fancy         0.87
flashy        0.83
bothered      0.80
noticeable    0.77
pleasant      0.76
surprising    0.76
forgiving     0.75
distracting   0.75
bother        0.75
Activations Density 0.030%