INDEX
Explanations
instances of the word "split" and its variations
Negative Logits
het        -0.17
474        -0.17
ensen      -0.17
iki        -0.15
uest       -0.14
irected    -0.14
ermen      -0.14
WS         -0.14
704        -0.14
sl         -0.14
Positive Logits
ting       0.34
tered      0.28
TING       0.24
tering     0.23
ter        0.23
apart      0.21
(split     0.20
tring      0.20
TINGS      0.20
.Split     0.20
Activations Density 0.023%