Explanations
references to deception or imitation in a context involving music or art
Negative Logits
Token          Logit
linger         -0.17
ushman         -0.16
Unchecked      -0.15
TexParameter   -0.15
ãĥ¼ãĥį         -0.14
ká»            -0.14
achts          -0.13
/Foundation    -0.13
icho           -0.13
á»ijc          -0.13
Positive Logits
Token          Logit
fake           0.69
Fake           0.57
fake           0.56
Fake           0.54
false          0.50
faker          0.47
false          0.41
fals           0.41
(fake          0.40
False          0.39
Activations Density: 0.034%