INDEX
Explanations
phrases related to obstacles or starting points
keywords related to obstacles or transitional phases in narratives or arguments
Negative Logits
Token       Logit
ngth        -0.75
weddings    -0.71
thood       -0.68
merce       -0.67
profits     -0.63
yss         -0.63
angered     -0.63
deen        -0.63
reviews     -0.62
hotels      -0.62
Positive Logits
Token       Logit
lude        0.85
harb        0.83
hurdle      0.79
wark        0.77
point       0.77
starter     0.76
quel        0.75
brainer     0.75
obstacle    0.75
blow        0.74
Activations Density: 0.279%