INDEX
Explanations
phrases indicating outcomes or revelations
phrases indicating situations that evolve or reveal themselves over time
Negative Logits
achus       -0.70
cham        -0.68
afia        -0.66
eatures     -0.66
brow        -0.66
panel       -0.66
Previous    -0.65
CW          -0.64
antha       -0.64
colo        -0.64
Positive Logits
quite         0.76
REALLY        0.73
pretty        0.70
really        0.70
ozy           0.69
MUCH          0.68
surprisingly  0.66
reversed      0.66
remarkably    0.64
disastrous    0.64
Activations Density: 0.100%