Explanations
the action of pretending or feigning
instances of the word "pretend" and its variations
Negative Logits
Token      Logit
20439      -0.74
âĨij        -0.70
GOODMAN    -0.65
cutting    -0.65
ccording   -0.62
srf        -0.62
士         -0.61
palms      -0.61
sbm        -0.60
vez        -0.60
Positive Logits
Token      Logit
innocence  0.83
ulence     0.70
entious    0.69
forgot     0.68
pas        0.65
otherwise  0.63
pret       0.63
¯          0.62
falsely    0.62
inco       0.62
Activations Density 0.022%