Explanations
instances of the word "pretend" in various contexts
Negative Logits
jack            -0.17
/lg             -0.17
ëŀ              -0.17
iÄįka           -0.16
aiser           -0.16
ActionCreators  -0.15
oggler          -0.15
âĹĦ             -0.15
ÐŁÐļ            -0.15
rese            -0.15
Positive Logits
pret            0.16
779             0.15
REP             0.14
uous            0.14
ceptive         0.14
ÑĦи             0.13
918             0.13
glich           0.13
pid             0.13
-tip            0.13
Activations Density 0.013%