INDEX
Explanations
instances of deception and hidden truths in human behavior and societal norms
Negative Logits
quat       -0.16
[now       -0.15
/REC       -0.14
ToPoint    -0.14
"class     -0.14
locator    -0.13
izzas      -0.13
å»         -0.13
å¡ļ        -0.13
//{{       -0.13
Positive Logits
underlying  0.56
underneath  0.49
behind      0.46
beneath     0.42
hidden      0.39
Behind      0.36
Behind      0.34
Bene        0.31
-hidden     0.31
hidden      0.30
Activations Density 0.232%