Explanations
words related to drama and dramatic narratives
Negative Logits
iors      -0.20
aires     -0.16
::$       -0.15
iw        -0.15
eil       -0.15
emodel    -0.15
orate     -0.15
ior       -0.15
ibling    -0.15
iais      -0.14
Positive Logits
atic      0.26
atically  0.22
buie      0.21
tic       0.20
ATIC      0.19
atics     0.19
atur      0.19
matic     0.18
atis      0.18
ming      0.17
Activations Density 0.007%