Explanations
numerical values and associated parameters
Negative Logits
\(          -0.14
ÙĪØ«        -0.13
Crafts      -0.13
hani        -0.13
eson        -0.13
olicited    -0.13
atego       -0.13
vara        -0.13
_fence      -0.13
lee         -0.12
Positive Logits
<           0.56
<           0.27
><          0.22
"><         0.22
><          0.21
<$          0.20
<↵          0.20
/<          0.19
=""><       0.19
(<          0.18
Activations Density 0.043%