INDEX
Explanations
instances of "Se" or related variations
Negative Logits
Token     Logit
sec       -0.18
ingham    -0.17
loc       -0.17
lla       -0.17
se        -0.16
ser       -0.16
sect      -0.15
ovy       -0.15
047       -0.15
die       -0.15
Positive Logits
Token     Logit
amus      0.24
bastian   0.24
bast      0.22
attles    0.22
vere      0.21
aside     0.20
infeld    0.20
ATTLE     0.20
attle     0.20
ismic     0.18
Activations Density 0.022%