Explanations
certain special characters or types of characters in a document
specific non-English characters or symbols
Negative Logits
ãĥ£  -0.86
zona  -0.78
EStreamFrame  -0.77
senal  -0.74
enfranch  -0.73
exha  -0.72
bucks  -0.71
goers  -0.71
ĪĴ  -0.70
merce  -0.69
Positive Logits
½  0.88
ï¸ı  0.84
а  0.80
е  0.79
âĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢ  0.79
iquette  0.77
ronic  0.76
idy  0.76
и  0.76
×ķ  0.75
Activations Density 0.011%
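
Several of the token strings above (for example `âĶĢ` and `×ķ`) look like raw byte-level BPE vocabulary entries rather than readable text. The sketch below shows one way to decode them, assuming the tokenizer uses the GPT-2-style byte-to-unicode mapping; the source does not name the tokenizer, so that is an assumption. The helper name `decode_token` is hypothetical and introduced only for illustration.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Rebuild the GPT-2-style byte -> printable-character table (assumed mapping)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a byte-level token string back into text."""
    # Characters outside the table (e.g. Cyrillic letters shown as-is in the
    # dashboard) mean the token is already decoded, so return it unchanged.
    if any(ch not in UNICODE_TO_BYTE for ch in token):
        return token
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Tokens that hold only part of a multi-byte character cannot be decoded
    # on their own; errors="replace" marks them with U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["âĶĢ", "×ķ", "ãĥ£", "zona"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption, `âĶĢ` decodes to the box-drawing character U+2500, `×ķ` to the Hebrew letter U+05D5, and `ãĥ£` to the katakana U+30E3, which fits the explanations about special and non-English characters. Tokens such as `ĪĴ` contain only continuation bytes and decode to replacement characters.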