INDEX
Explanations
specific terminology related to educational or programming contexts
Negative Logits
isay        -0.15
thers       -0.13
بان         -0.13
amel        -0.13
UCCEEDED    -0.13
稿          -0.13
while       -0.12
odem        -0.12
trib        -0.12
.zone       -0.12
Positive Logits
because     0.55
because     0.51
porque      0.45
，因为      0.43
Because     0.43
Because     0.42
因为        0.40
perché      0.38
karena      0.37
ecause      0.37
Activations Density 0.002%