Explanations
phrases related to imitation or imitation products
references to artificiality or imitation in various contexts
Negative Logits
edient    -0.73
ository   -0.73
llah      -0.73
omb       -0.71
ovo       -0.70
essa      -0.70
onent     -0.70
enance    -0.69
itute     -0.69
Hoffman   -0.68
Positive Logits
pas       1.14
faux      0.99
pas       0.90
hawk      0.81
lion      0.74
cus       0.72
cin       0.70
lam       0.70
fur       0.68
tan       0.68
Activations Density: 0.011%