Explanations
references to specific television and film roles, particularly those involving popular characters and actors
Head Attr Weights
0:0.05
1:0.02
2:0.03
3:0.05
4:0.09
5:0.18
6:0.03
7:0.05
8:0.04
9:0.30
10:0.03
11:0.08
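A minimal sketch of how the head attribution weights above might be summarized, assuming each entry is a `head:weight` pair for one of 12 attention heads. The listing does not say how the weights are normalized; as printed they sum to 0.95, presumably because of rounding. The names `head_weights` and `top_heads` are hypothetical and used only for illustration.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {0: 0.05, 1: 0.02, 2: 0.03, 3: 0.05, 4: 0.09, 5: 0.18,
                6: 0.03, 7: 0.05, 8: 0.04, 9: 0.30, 10: 0.03, 11: 0.08}


def top_heads(weights, k=2):
    """Return the k (head, weight) pairs with the largest weight."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


total = sum(head_weights.values())          # sum of the printed weights
top = top_heads(head_weights)               # the two strongest heads
share = sum(w for _, w in top) / total      # their share of the printed total

print(top, round(total, 2), round(share, 2))
```

Under these assumptions, heads 9 and 5 together account for roughly half of the printed attribution weight.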
Negative Logits
══          -2.89
Their       -2.84
exceeds     -2.81
�           -2.68
smanship    -2.59
They        -2.54
exceed      -2.51
rehens      -2.50
THEY        -2.47
THEIR       -2.44
Positive Logits
narrator    3.15
antagonist  2.57
agonist     2.53
roo         2.49
Guest       2.49
bane        2.47
angelo      2.42
asher       2.40
villain     2.40
aboard      2.39
Activations Density 0.081%