INDEX
Explanations
New followed by places or things
new followed by type or void followed by function
Negative Logits
Token    Logit
.        1.03
are      0.72
the      0.71
↵        0.71
re       0.69
ate      0.66
that     0.65
ле       0.63
al       0.61
will     0.61
Positive Logits
Token    Logit
at       0.80
ک        0.76
ਰ        0.68
تم       0.67
was      0.66
кою      0.63
کس       0.61
0        0.61
باہر     0.60
م        0.60
Activations Density 0.123%