INDEX
Explanations
terms related to artificiality and deception
"Fake" or "artificial" preceding nouns
fake or artificial things
Negative Logits
{\            -0.51
pribadi       -0.49
ಂತ            -0.48
usiai         -0.47
înal          -0.47
oamen         -0.46
BoxFit        -0.46
usercontent   -0.46
tamment       -0.46
reactivex     -0.45
Positive Logits
LookAnd       0.90
Fake          0.80
fake          0.73
Fake          0.71
pretending    0.71
sembl         0.71
pretence      0.70
pretends      0.69
Mock          0.69
pretend       0.69
Activations Density 0.225%