INDEX
Explanations
web links or handles, specifically with multiple characters, numbers, and unusual punctuation
URLs and links
Negative Logits
ÂŃ              -0.71
afore           -0.69
ãĢij             -0.66
caring          -0.61
corrections     -0.60
(blank)         -0.60
wanting         -0.57
misrepresent    -0.57
franchise       -0.57
notebooks       -0.56
Positive Logits
zx    1.23
zn    1.19
FK    1.16
ql    1.16
qv    1.13
Iv    1.12
YR    1.11
yx    1.11
nv    1.09
zl    1.09
Activations Density: 0.056%