Explanations
physical attributes and actions
Negative Logits
西  0.47
績  0.46
[no token shown]  0.45
南  0.45
留  0.44
Provide  0.41
ال  0.40
Katal  0.40
保  0.39
ە  0.39
Positive Logits
líquido  0.48
graft  0.48
cukup  0.47
carbono  0.46
odu  0.46
aprend  0.45
gespre  0.45
સિંહ  0.45
剤  0.45
happ  0.44
Activations Density 0.002%