INDEX
Explanations
instances of the character "ĉ" or variations in styling
emotional responses and reactions in text
Head Attr Weights
0:0.03
1:0.04
2:0.02
3:0.05
4:0.03
5:0.06
6:0.02
7:0.07
8:0.11
9:0.02
10:0.03
11:0.47
Negative Logits
-4.02  ‐
-3.89
-3.48  )—
-3.40  —
-3.24  —"
-3.19  —-
-3.13  "…
-2.86  –
-2.83  geo
-2.78  organis
Positive Logits
5.60  JUST
5.48  `.
5.14  ----------------------------------------------------------------
4.72  ..."
4.71  [...]
4.70  ........
4.67  ...]
4.62  ``
4.59  `
4.46  --------------------------------
Activations Density 0.003%