INDEX
Explanations
mentions of actors and related terms
references to actors
Negative Logits
£ı           -0.74
vironment    -0.72
asonable     -0.68
tera         -0.67
plet         -0.66
yss          -0.65
ensable      -0.64
elt          -0.63
ruciating    -0.62
elsius       -0.62
Positive Logits
actor        1.00
rities       0.96
actress      0.88
acters       0.83
writers      0.80
actors       0.79
Actor        0.77
rano         0.75
Actor        0.75
plays        0.75
Activations Density 0.021%