INDEX
Explanations
phrases related to visual appearance or hypothetical scenarios
phrases discussing appearances or projections of how things will look in the future
Negative Logits
cipl      -0.73
ukong     -0.72
ilts      -0.63
QUEST     -0.61
lo        -0.61
rists     -0.59
ishable   -0.58
Jes       -0.58
Bac       -0.57
conn      -0.57
Positive Logits
nowadays  0.77
today     0.75
WITHOUT   0.73
inside    0.72
tonight   0.72
AFTER     0.71
BEFORE    0.71
outside   0.69
when      0.69
before    0.69
Activations Density: 0.056%