INDEX
Explanations
texts discussing characteristics or traits of individuals
concepts related to psychological traits and behaviors
Negative Logits
Canaver     -0.75
GOODMAN     -0.53
ãĤ´ãĥ³      -0.51
$.          -0.50
ABV         -0.49
disclaimer  -0.47
Spoiler     -0.47
¶           -0.46
BAT         -0.46
BILITIES    -0.45
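The token `ãĤ´ãĥ³` in the negative logits list looks like the printable byte-level form that GPT-2-style BPE tokenizers use for non-ASCII text. The sketch below assumes that is the tokenizer behind this page (the page itself does not say so) and maps such a display string back to UTF-8. The helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Byte -> printable-character table in the style of GPT-2 byte-level BPE.

    Assumption: the dashboard's tokenizer uses this mapping.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    # Printable bytes map to themselves.
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to code point 256 and above, in byte order.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Turn a byte-level display string back into readable text."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in display)
    # Tokens can end mid-character, so invalid sequences are replaced, not raised.
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĤ´ãĥ³"))  # prints ゴン
```

Under that assumption the token is the katakana pair ゴン. Characters outside the table, such as a literal space, raise `KeyError`, because byte-level display strings never contain them.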
Positive Logits
)"    0.75
)",   0.73
");   0.65
";    0.63
','   0.56
)."   0.55
\",   0.53
;"    0.53
),"   0.53
['    0.53
Activations Density 1.811%
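The page does not define activation density. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below computes that fraction for a made-up list of activations; the function name, the `threshold` parameter, and the sample values are all hypothetical.

```python
def activation_density(activations: list[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires at all; the source page does not spell this out.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up per-token activations, for illustration only.
sample = [0.0, 0.0, 3.1, 0.0, 0.0, 0.0, 0.4, 0.0]
print(f"{activation_density(sample):.3%}")  # prints 25.000%
```

On this reading, a density of 1.811% would mean the feature is active on roughly 18 of every 1,000 sampled tokens.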