INDEX
Explanations
the presence of punctuation marks that indicate the end of sentences
Negative Logits
Token      Logit
981        -0.17
azz        -0.14
van        -0.14
-eng       -0.14
#Region    -0.14
بنا        -0.13
ело        -0.13
„M         -0.13
端         -0.13
orden      -0.12
Positive Logits
Token      Logit
Die        0.32
Es         0.28
Das        0.26
Die        0.26
Im         0.26
Als        0.24
Dar        0.24
Dies       0.23
Da         0.23
Der        0.23
Activations Density 0.022%