INDEX
Explanations
themes of trust and deception in relationships and narratives
Negative Logits
projected    -0.16
predicted    -0.15
olang        -0.15
//           -0.14
tej          -0.14
g            -0.14
Burke        -0.14
uros         -0.14
WR           -0.13
ABS          -0.13
Positive Logits
dorf         0.17
serter       0.16
_Generic     0.15
nier         0.14
æ¢           0.14
_GC          0.14
еÑĢед        0.14
çİī          0.14
/rfc         0.14
PFN          0.14
Activations Density 0.192%