INDEX
Explanations
positive evaluation and praise
Negative Logits
வதில்லை  0.42
dibawah  0.37
[token not shown]  0.37
কখনই  0.36
[token not shown]  0.35
consumes  0.35
animals  0.35
programs  0.34
unable  0.34
automóviles  0.34
Positive Logits
excellent  0.69
excelente  0.66
commendable  0.66
insightful  0.66
impressively  0.66
excellente  0.66
thoughtful  0.64
admirably  0.64
很好  0.63
很好的  0.63
Activations Density 0.412%