INDEX
Explanations
references to actors and their roles in various contexts
Negative Logits
ader       -0.19
erable     -0.19
erator     -0.18
est        -0.18
ned        -0.17
seo        -0.16
394        -0.16
coming     -0.16
eration    -0.16
Acts       -0.15
Positive Logits
uate       0.20
-direct    0.19
roles      0.19
roles      0.18
Roles      0.18
/music     0.18
uated      0.17
/model     0.17
/wait      0.16
uating     0.16
Activations Density 0.013%