INDEX
Explanations
specific nouns related to various topics, such as animals, technology, and physical attributes
various descriptors and nouns associated with the color red and concepts of humor or whimsy
Negative Logits
Token        Logit
Niet         -0.67
Borders      -0.67
prest        -0.58
Azerb        -0.56
SPONSORED    -0.56
Kahn         -0.55
Nanto        -0.54
Berk         -0.54
Seym         -0.53
Nare         -0.53
Positive Logits
Token        Logit
][           0.78
]            0.75
)            0.74
::           0.72
_            0.71
:=           0.71
);           0.69
+=           0.67
):           0.66
))           0.65
Activations Density 0.312%