INDEX
Explanations
interrogative words related to identity and location
Negative Logits
artifacts       -0.67
astern          -0.64
Interstitial    -0.63
amy             -0.57
waves           -0.56
WATCHED         -0.55
é¾įåĸļ士        -0.55
suscept         -0.54
validity        -0.53
resid           -0.53
Positive Logits
uer             0.73
they            0.67
lled            0.66
itiveness       0.65
THEY            0.64
hest            0.63
you             0.63
they            0.60
thou            0.60
usual           0.59
Activations Density 0.260%