Explanations
Instances of the prefix "Se" or similar variations, likely indicating a focus on specific named entities or categories.
Negative Logits
Token     Logit
ingham    -0.17
loc       -0.17
sec       -0.17
thers     -0.17
ser       -0.16
ì§Ŀ       -0.15
lob       -0.15
lla       -0.15
ovy       -0.15
log       -0.15
Positive Logits
Token     Logit
bastian   0.26
bast      0.23
vere      0.22
ATTLE     0.22
attle     0.21
amus      0.21
infeld    0.21
attles    0.19
jour      0.19
ismic     0.18
Activations Density 0.024%