INDEX
Explanations
conditional statements indicating potential benefits or outcomes
Negative Logits
Token        Logit
æĹ¢          -0.20
rane         -0.18
instead      -0.17
eries        -0.16
instead      -0.14
altung       -0.14
eko          -0.14
seulement    -0.14
ovat         -0.14
nejen        -0.14
Positive Logits
Token        Logit
also         0.62
also         0.53
Also         0.47
Also         0.47
ALSO         0.43
juga         0.42
também       0.41
también      0.41
aussi        0.39
także        0.38
Activations Density 0.042%
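
The density figure can be read as the share of tokens on which the feature fires. The sketch below is a minimal illustration under that assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    is active. The dashboard does not state its exact definition.
    """
    if not activations:
        raise ValueError("need at least one token activation")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 tokens, of which 42 activate the feature.
sample = [0.0] * 99_958 + [1.5] * 42
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.042% corresponds to roughly 42 active tokens per 100,000.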