INDEX
Explanations
proper nouns, particularly names of places and notable figures
Negative Logits
oid     -0.18
ĭ       -0.17
abra    -0.17
esteem  -0.17
ause    -0.16
pred    -0.15
ending  -0.15
obar    -0.15
rend    -0.15
itur    -0.14
Positive Logits
omba           0.15
losti          0.15
ắc             0.14
DataColumn     0.14
omb            0.14
DISPATCH       0.14
bucket         0.14
buckets        0.14
Contributions  0.14
rag            0.14
Activations Density 0.067%