Explanations
past actions or habits
phrases that describe past experiences or actions with "used to"
Negative Logits
ensued        -0.68
worthiness    -0.67
ouf           -0.65
abl           -0.64
deserves      -0.63
ographies     -0.62
lip           -0.61
worthy        -0.61
icious        -0.60
icipated      -0.60
Positive Logits
igslist       0.79
rent          0.78
shrink        0.75
hear          0.75
bombard       0.75
bury          0.74
find          0.73
advertise     0.72
form          0.72
peg           0.72
Activations Density 0.057%
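
The density figure above can be read as the share of sampled tokens on which this feature fires. The sketch below shows one way such a number might be computed. It assumes that "active" means an activation strictly greater than zero, which the page does not confirm. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and only for illustration.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as active when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 4 of which activate the feature.
sample = [0.0] * 9996 + [1.2, 0.4, 3.1, 0.9]
density = activation_density(sample)

# Format as a percentage, in the same style as the dashboard figure.
print(f"Activations Density {density:.3%}")
```

A density this low would mean the feature is sparse: it fires on only a handful of tokens per ten thousand, which fits a feature tied to one specific construction such as "used to".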