INDEX
Explanations
occurrences of the word "in" at various frequencies
Negative Logits
spite     -0.20
sofar     -0.18
омер      -0.15
leans     -0.15
971       -0.14
/about    -0.14
ypical    -0.14
iod       -0.13
lef       -0.13
¹         -0.13
Positive Logits
short     0.45
other     0.39
essence   0.36
short     0.35
lay       0.35
simple    0.34
simpler   0.33
-short    0.32
other     0.30
ngắn      0.30
Activations Density 0.090%