INDEX
Explanations
phrases indicating timing or conditional situations
Negative Logits
bezeichneter    -0.84
Fid             -0.78
Indah           -0.70
Shrewsbury      -0.67
Hush            -0.65
ArrowToggle     -0.64
افظة            -0.63
addChild        -0.60
bezit           -0.60
ます            -0.60
Positive Logits
when      1.56
when      1.48
WHEN      1.46
When      1.39
When      1.36
WHEN      1.33
cuando    1.24
när       1.22
cuando    1.20
Cuando    1.16
Activations Density: 0.129%