Explanations
variable assignments and labels
Negative Logits
@@]  0.38
лася  0.37
برق  0.36
也会  0.35
iftoire  0.35
⤵  0.34
νοντας  0.34
balsam  0.33
otherArchive  0.33
σταν  0.33
Positive Logits
The  0.51
represents  0.49
=  0.47
The  0.47
Represents  0.42
(blank)  0.42
ise  0.42
(blank)  0.39
or  0.39
denotes  0.39
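Token/logit lists like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding and ranking tokens by the resulting score. The page does not say how its values were computed, so the following is only a minimal sketch under that assumption. The function names, the two-dimensional vectors, and the numbers are hypothetical and chosen for illustration; they are not this feature's weights.

```python
def logit_effects(decoder_vec, unembed_rows):
    """Score each token by the dot product of the feature's decoder
    vector with that token's unembedding vector (assumed method)."""
    scores = []
    for token, row in unembed_rows.items():
        score = sum(d * u for d, u in zip(decoder_vec, row))
        scores.append((token, score))
    # Highest score first, so positive logits lead and negative logits trail.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def top_and_bottom(scores, k):
    """Return the k most positive and the k most negative entries."""
    return scores[:k], scores[-k:][::-1]


# Hypothetical toy data: a 2-dimensional decoder vector and three tokens.
decoder_vec = [1.0, 0.0]
unembed_rows = {
    "The": [0.5, 0.1],
    "=": [0.25, 0.9],
    "balsam": [-0.5, 0.3],
}

ranked = logit_effects(decoder_vec, unembed_rows)
positive, negative = top_and_bottom(ranked, 1)
print(positive, negative)
```

In a real model the unembedding has one row per vocabulary token, and the same ranking yields the ten-entry lists shown on a page like this one.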
Activations Density 0.101%
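An activation density is usually read as the fraction of token positions on which the feature fires at all, so 0.101% would correspond to roughly one token in a thousand. The sketch below assumes that definition, with "fires" meaning an activation above a threshold of zero. The function name, the threshold, and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 1 of 1000 token positions.
sample = [0.0] * 999 + [2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low indicates a sparse feature: it is silent on almost all text and activates only in a narrow set of contexts.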