INDEX
Explanations
first-person pronouns and expressions of personal experience or feeling
Negative Logits
Enlight      -0.16
enlight      -0.16
pioneered    -0.15
Typ          -0.15
pioneering   -0.15
enlightened  -0.15
elog         -0.15
quisite      -0.15
PFN          -0.14
lington      -0.14
Positive Logits
wanted       0.20
rew          0.19
wanted       0.18
drew         0.16
Wanted       0.16
annis        0.15
unix         0.15
添           0.15
ioutil       0.15
channel      0.14
Activations Density 0.135%