INDEX
Explanations
expressions of comparison or contrast
Negative Logits
Token       Logit
rane        -0.16
lle         -0.15
Boeh        -0.14
ucas        -0.13
ulating     -0.13
anders      -0.13
Enlarge     -0.13
εια         -0.13
OURS        -0.13
addChild    -0.13
Positive Logits
Token       Logit
also        0.27
sino        0.25
also        0.24
Also        0.23
Also        0.22
but         0.22
بلکه        0.21
sondern     0.21
também      0.19
también     0.18
Activations Density: 0.026%