INDEX
Explanations
references to theaters
mentions of theaters and theatrical productions
Negative Logits
nir          -0.80
doms         -0.78
ilies        -0.73
ortium       -0.69
nesty        -0.68
hee          -0.68
pages        -0.67
vironment    -0.66
tan          -0.66
rontal       -0.65
Positive Logits
goers         1.07
marqu         1.04
wright        0.98
theatre       0.95
theater       0.94
theaters      0.89
theat         0.88
Royale        0.85
productions   0.83
trou          0.83
Activations Density 0.030%
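
For context on the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical, chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is taken to mean the share of tokens on which
    the feature fires at all. This is an illustrative definition, not one
    confirmed by the page.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (made up): the feature fires on 3 of 10,000 token positions.
toy_activations = [0.0] * 9997 + [1.2, 0.4, 2.5]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.030% corresponds to the feature firing on roughly 3 of every 10,000 tokens, which fits a narrow, topic-specific feature such as one for theaters.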