Explanations
function definitions and results
Negative Logits
Token          Value
λα             1.41
Μα             1.38
swoją          1.35
μαγγ           1.34
birçok         1.34
த்ரே           1.34
zahlreiche     1.33
regiões        1.33
<unused409>    1.32
யோ             1.31
Positive Logits
Token               Value
<eos>               1.78
↵↵↵↵                1.05
↵↵↵                 0.98
.</                 0.95
↵↵↵↵↵               0.95
៕                   0.92
<start_of_image>    0.91
↵↵                  0.90
↵↵↵↵↵↵              0.88
。<                  0.86
Activations Density 0.024%