Explanations
representing or defining concepts
Negative Logits
you       0.92
você      0.84
vocês     0.76
you       0.76
You       0.73
Você      0.70
You       0.70
kalian    0.69
আপনারা    0.68
?),       0.66
Positive Logits
representa       1.03
representan      1.00
rappresenta      0.96
represents       0.95
represent        0.95
remain           0.93
representando    0.91
representing     0.91
representing     0.91
rappresent       0.91
Activations Density 0.173%