INDEX
Explanations
special characters with specific numerical values that seem to be unique identifiers in the text
the occurrence of the character 'Ģ'
Negative Logits
raints        -0.77
ifications    -0.63
enegger       -0.62
ified         -0.60
papers        -0.59
wid           -0.59
birth         -0.58
patches       -0.58
papers        -0.58
opian         -0.57
Positive Logits
Ģ                           1.34
âĶĢâĶĢâĶĢâĶĢ                0.94
âĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢ    0.93
λ                           0.90
ĭ                           0.89
OTE                         0.88
ģ                           0.88
ilus                        0.86
hoe                         0.82
âĶĢâĶĢ                      0.82
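Several of the positive-logit tokens (Ģ, ĭ, ģ, âĶĢ) look like the printable stand-ins that byte-level BPE vocabularies use for raw bytes. The sketch below assumes, without confirmation from this page, that the tokens come from a GPT-2-style byte-level vocabulary; it rebuilds that byte-to-character table and maps a displayed token back to its bytes. The names `token_to_bytes` and `token_to_text` are hypothetical helpers introduced for illustration. It only handles strings made entirely of characters in that table, so an entry such as λ, which appears to be shown already decoded, would raise a `KeyError`.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table mapping each byte 0..255 to a printable character.

    Assumption: the dashboard tokens use this scheme; the page does not say so.
    """
    # Bytes that are already printable map to themselves:
    # '!'..'~' (0x21-0x7E), 0xA1-0xAC, and 0xAE-0xFF.
    bs = list(range(0x21, 0x7F)) + list(range(0xA1, 0xAD)) + list(range(0xAE, 0x100))
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned the next code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Map a displayed token string back to the raw bytes it stands for."""
    return bytes(BYTE_DECODER[ch] for ch in token)


def token_to_text(token: str) -> str:
    """Decode the raw bytes as UTF-8; incomplete sequences become U+FFFD."""
    return token_to_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(token_to_bytes("Ģ"))    # b'\x80'
    print(token_to_text("âĶĢ"))  # ─
```

Under that assumption, Ģ is the single byte 0x80, a UTF-8 continuation byte, and âĶĢ is the three-byte sequence E2 94 80, the box-drawing character ─. That reading fits both explanations above: the feature favors a trailing continuation byte and runs of horizontal rule characters.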
Activations Density 0.004%