INDEX
Explanations
all followed by descriptive verbs
Negative Logits
או    0.45
or    0.45
hoặc    0.44
或    0.43
),    0.43
ו    0.43
arcs    0.42
অথবা    0.41
jede    0.41
manuals    0.40
Positive Logits
in    0.41
ay    0.40
ong    0.38
istä    0.37
of    0.36
ickey    0.34
在    0.34
hugely    0.34
ati    0.33
जगह    0.33
Activations Density 0.026%