INDEX
Explanations
information about person appearances in a film
Negative Logits
ustomed      -0.78
án           -0.77
âĢij         -0.76
20439        -0.74
obil         -0.74
bitious      -0.74
etermined    -0.73
ornings      -0.72
ÂŃ           -0.71
ensable      -0.70
Positive Logits
kinda        1.08
anyways      1.05
shitty       1.02
lol          1.00
devs         0.99
stupidity    0.98
fucking      0.95
bullshit     0.94
idiots       0.93
stupid       0.93
Activations Density: 1.482%