INDEX
Explanations
phrases related to actors starring in various movies and shows
the presence of phrases indicating roles in film or television
Negative Logits
learners     -0.83
llor         -0.77
convol       -0.76
incumb       -0.74
incl         -0.67
deterrent    -0.65
lear         -0.63
pec          -0.62
taught       -0.62
awei         -0.62
Positive Logits
lieu         1.22
spite        1.08
animate      1.06
clus         1.04
conjunction  1.04
disguise     1.03
accordance   0.99
vain         0.99
strument     0.98
versions     0.97
Activations Density: 0.519%