INDEX
Explanations
formal phrasing and transformations
Negative Logits
Token      Value
्याज       0.91
Lonely     0.84
soprano    0.84
ᑎ          0.83
不是       0.81
ᕝ          0.81
escritor   0.80
chibi      0.80
ᓂ          0.79
Caffeine   0.78
POSITIVE LOGITS
Token        Value
Moreover     0.78
…            0.78
Furthermore  0.77
...”         0.76
].”          0.73
...          0.73
[            0.73
[...]        0.72
میکنند       0.72
…            0.71
Activations Density 0.955%