INDEX
Explanations
sections of text formatted with specific characters or symbols
Negative Logits
Fant         -0.67
Aki          -0.67
Duane        -0.67
ing          -0.67
оригіналу    -0.66
:✨           -0.66
áklad        -0.65
ة            -0.64
lapsible     -0.63
ceto         -0.63
Positive Logits
}}           2.48
}}           1.81
"}}          1.77
'}}          1.73
.}}          1.64
()}}         1.48
$}}          1.39
)}}          1.32
}}}}         1.29
}}}          1.27
Activations Density 0.224%
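
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is above a threshold; the page itself does not state this definition. The function name, the threshold of 0.0, and the sample activations are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, the feature fires on two of them.
acts = [0.0] * 1000
acts[10] = 1.2
acts[500] = 0.4

print(f"{activation_density(acts):.3%}")  # prints 0.200%
```

Under that assumption, a density of 0.224% would mean the feature is active on roughly 2 of every 1000 tokens in the sample, which would be consistent with a feature that targets a narrow pattern such as closing braces.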