INDEX
Explanations
phrases or text indicating a new starting point or beginning
phrases indicating the beginning of new events or actions
Negative Logits
itsch         -0.83
thel          -0.71
cedented      -0.71
greg          -0.69
ingly         -0.69
ombat         -0.67
otropic       -0.66
atron         -0.65
ilit          -0.65
vertisement   -0.64
Positive Logits
anew          0.94
salaries      0.75
XI            0.72
point         0.68
tomorrow      0.68
TODAY         0.67
ages          0.65
Point         0.62
pitcher       0.62
inference     0.61
Activations Density: 0.056%