INDEX
Explanations
references to formal communications or documentation
Negative Logits
659        -0.19
γά         -0.18
ëłµ        -0.16
cri        -0.15
eko        -0.15
yen        -0.14
ought      -0.13
aid        -0.13
åľ         -0.13
åıĹ        -0.13
Positive Logits
saying      0.28
about       0.24
pur         0.23
stating     0.20
regarding   0.19
asking      0.18
addressed   0.18
entitled    0.18
concerning  0.17
indicating  0.16
Activations Density 0.241%