INDEX
Explanations
references to theaters or similar entertainment venues
references to different theater establishments and productions
Negative Logits
Token   Logit
doms    -0.83
imo     -0.79
luent   -0.75
agher   -0.72
nir     -0.69
nesty   -0.69
itive   -0.68
yg      -0.66
plets   -0.66
unin    -0.65
Positive Logits
Token         Logit
goers         1.21
marqu         1.05
productions   0.97
theatre       0.90
theater       0.88
Royale        0.87
trou          0.87
theaters      0.83
halls         0.82
wright        0.81
Activations Density 0.076%