INDEX
Explanations
phrases indicating progression or change over time
Negative Logits
glers          -0.63
otos           -0.56
ociate         -0.56
izons          -0.53
anwhile        -0.53
Odyssey        -0.52
76561          -0.52
feld           -0.52
respectively   -0.51
oway           -0.49
Positive Logits
raining        0.94
impossible     0.71
easy           0.67
easier         0.62
downhill       0.60
uphill         0.59
ironic         0.58
funny          0.57
safe           0.55
obvious        0.55
Activations Density: 0.599%