INDEX
Explanations
high-frequency common words and references in technical contexts
Negative Logits
isty    -0.16
olic    -0.15
abe     -0.15
mania   -0.15
esen    -0.14
erged   -0.14
Bills   -0.14
ekler   -0.14
kre     -0.14
neger   -0.14
Positive Logits
CSI     0.15
ounce   0.15
burner  0.14
isoft   0.14
hol     0.14
aper    0.14
Yoshi   0.14
awah    0.14
prise   0.14
bite    0.14
Activations Density 0.003%