INDEX
Explanations
references to actors or actresses
references to actors and actresses
Negative Logits
manac    -0.74
cling    -0.70
rotein   -0.68
heet     -0.67
otide    -0.66
tical    -0.64
udge     -0.63
aq       -0.63
gn       -0.62
jong     -0.61
Positive Logits
actors       3.84
actor        2.75
actresses    2.30
Actor        1.98
performers   1.95
Actor        1.93
actress      1.86
filmmakers   1.77
singers      1.66
comedians    1.59
Activations Density: 0.022%