Explanations
actions or descriptions followed by details
Negative Logits
Widers  0.42
fein  0.41
Low  0.40
వె  0.40
韦  0.40
ತಿಳಿದ  0.40
眉头  0.39
Tempt  0.39
décro  0.39
reed  0.38
Positive Logits
ㅃ  0.50
ബോ  0.49
negativos  0.49
segmento  0.47
uso  0.47
imgur  0.47
podob  0.47
andır  0.46
vua  0.46
키  0.45
Activations Density 0.000%