INDEX
Explanations
sequences containing specific characters, potentially as part of code or other specialized text
occurrences of a specific character or symbol, likely focusing on a particular character or motif throughout the document
Negative Logits
Token         Logit
geries        -0.94
distracting   -0.69
distracted    -0.69
sworth        -0.67
responders    -0.65
regul         -0.65
foreground    -0.64
tee           -0.63
luent         -0.63
raints        -0.62
Positive Logits
Token         Logit
и             1.05
ski           0.92
о             0.92
à¸            0.89
оÐ            0.88
Ñĥ            0.87
æŃ¦           0.85
âĸĦ           0.85
е             0.84
а             0.84
Activations Density: 0.008%
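
Several of the positive-logit tokens (for example "Ñĥ" and "æŃ¦") look like byte-level BPE token strings rather than readable text. The sketch below shows one way to turn such strings back into characters. It assumes the tokens use the GPT-2 style byte-to-unicode mapping, which this page does not confirm; the helper names `byte_to_unicode_table` and `decode_token` are hypothetical and introduced only for illustration.

```python
def byte_to_unicode_table() -> dict[int, str]:
    """Rebuild a GPT-2 style table mapping each byte to a printable character.

    ASSUMPTION: the dashboard's tokenizer uses this mapping. Printable bytes
    map to themselves; every other byte maps to chr(256 + k) in byte order.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {b: chr(b) for b in printable}
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse lookup: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in byte_to_unicode_table().items()}


def decode_token(token: str) -> str:
    """Decode a displayed token string into text (hypothetical helper).

    Characters found in the table become single bytes. Any other character
    (for example an already-decoded Cyrillic letter) is kept by re-encoding
    it as UTF-8. Incomplete byte sequences decode to U+FFFD.
    """
    raw = bytearray()
    for ch in token:
        if ch in UNICODE_TO_BYTE:
            raw.append(UNICODE_TO_BYTE[ch])
        else:
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


for tok in ["Ñĥ", "æŃ¦", "âĸĦ", "à¸", "ski"]:
    print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, "Ñĥ" would correspond to the Cyrillic letter "у", "æŃ¦" to the CJK character "武", and "âĸĦ" to the block character "▄". Tokens such as "à¸" and "оÐ" end in an incomplete UTF-8 prefix, so they cannot be decoded to a full character on their own.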