INDEX
Explanations
references to operas
references to opera
Negative Logits
Token      Logit
animous    -0.80
erred      -0.73
adow       -0.72
idential   -0.67
later      -0.66
ĪĴ         -0.65
airy       -0.65
erry       -0.64
alien      -0.64
unch       -0.63
Positive Logits
Token      Logit
opera      1.13
singers    0.89
singer     0.85
otomy      0.83
glers      0.83
Opera      0.79
traged     0.77
oper       0.76
rehearsal  0.76
theatre    0.73
Activations Density 0.005%
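
As a rough illustration of the density figure above, the sketch below assumes that activation density means the fraction of tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from the dashboard; under that assumption, 0.005% corresponds to roughly one active token in every 20,000.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 1 token out of 20,000.
toy_activations = [0.0] * 19_999 + [3.2]
density = activation_density(toy_activations)
print(f"{density:.3%}")  # 0.005%
```

A density this low is consistent with a narrowly scoped feature, such as one that fires only on opera-related text.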