INDEX
Explanations
lines that include symbols related to coding or text formatting
symbols or special characters often used in informal or digital communication
Negative Logits
scattering   -0.78
scatter      -0.72
dust         -0.67
zoning       -0.66
ensical      -0.64
wind         -0.63
monop        -0.63
itably       -0.63
stagger      -0.62
iliary       -0.62
Positive Logits
¹    1.15
º    1.00
Į    0.95
į    0.93
»    0.93
£    0.92
¼    0.85
¬    0.84
§    0.84
²    0.84
Activations Density: 0.720%