INDEX
Explanations
references to actors and their performances
Negative Logits
ynom      -0.16
engo      -0.15
.cg       -0.15
erse      -0.15
semb      -0.15
(TM       -0.14
visions   -0.14
.xz       -0.14
082       -0.13
sí        -0.13
Positive Logits
Starr     0.23
st        0.22
essays    0.21
-st       0.20
opposite  0.20
rop       0.20
essay     0.20
gro       0.19
groove    0.19
Katrina   0.19
Activations Density 0.018%