INDEX
Explanations
assertions or phrases related to taking action or making choices
Negative Logits
librement     -0.61
swt           -0.61
laikā         -0.60
lardır        -0.59
vestiti       -0.58
leçons        -0.58
mnoho         -0.57
träd          -0.56
feroit        -0.56
sostegno      -0.55
Positive Logits
оригіналу     0.96
Chham         0.78
kaarangay     0.72
:]:           0.69
ьаж           0.67
насеље        0.67
beginnetje    0.67
IsMutable     0.67
hit           0.66
hits          0.65
Activations Density 0.553%