INDEX
Explanations
information related to detailed measurements or distances
references to pledging or supporting projects
Negative Logits
Woody        -0.64
uv           -0.62
masculinity  -0.62
Chomsky      -0.60
deport       -0.58
bodily       -0.56
politics     -0.56
animate      -0.56
Born         -0.54
Kissinger    -0.54
Positive Logits
.).    0.88
)).    0.84
`.     0.82
>.     0.82
}.     0.82
!).    0.77
]).    0.76
().    0.75
).     0.72
.</    0.72
Activations Density 1.460%