Explanations
phrases indicating a progression or transition from one state to another
phrases indicating progressive actions or processes
Negative Logits
Token       Logit
ids         -0.72
juven       -0.69
immortal    -0.65
pur         -0.65
blot        -0.64
rug         -0.63
sort        -0.62
stand       -0.62
quake       -0.61
mistaken    -0.59
Positive Logits
Token       Logit
Through     1.02
Through     0.99
edIn        0.81
through     0.81
ategory     0.77
clair       0.77
thru        0.74
Collider    0.72
cape        0.72
ensibly     0.72
Activations Density 0.007%