Explanations
words related to strong emotions or impact
key emotional or impactful moments in narratives
Negative Logits
orthy         -0.70
Yose          -0.68
Pry           -0.65
begin         -0.64
CAST          -0.63
etics         -0.62
thumbnails    -0.61
Published     -0.61
Guest         -0.60
atre          -0.60
Positive Logits
alone         0.87
ioned         0.78
rame          0.73
milo          0.70
oggle         0.68
pesky         0.68
lasted        0.67
aton          0.67
:(            0.66
mattered      0.66
Activations Density 0.297%