INDEX
Explanations
Native tokens and their appearance
Negative Logits
look      0.45
Look      0.42
Look      0.41
looking   0.36
M         0.36
look      0.36
tr        0.36
نگاه      0.35
(         0.35
間        0.35
Positive Logits
erscheinen     0.55
appears        0.54
Appear         0.53
aparecer       0.52
erscheint      0.52
apparaissent   0.52
ظاهر           0.52
appears        0.51
詑             0.51
erschienen     0.50
Activations Density 0.000%