Explanations
descriptions or mentions of feature films
references to feature films
Negative Logits
Token        Logit
ignt         -0.71
ALP          -0.70
Klux         -0.69
ça           -0.69
vernment     -0.68
cale         -0.67
hiba         -0.66
conflic      -0.64
alty         -0.64
女           -0.63
Positive Logits
Token        Logit
icle         1.14
icles        1.02
prominently  0.99
ttes         0.95
Feature      0.84
orial        0.82
aceous       0.82
film         0.82
iful         0.82
icularly     0.78
Activations Density: 0.057%