Explanations
lists of entities and their descriptions
Negative Logits
Token      Value
:          0.59
7          0.48
و          0.48
.          0.41
வ்வேறு     0.40
зульта     0.40
ﻔ          0.40
5          0.38
4          0.38
ता         0.37
Positive Logits
Token      Value
from       0.75
on         0.63
of         0.59
with       0.57
at         0.53
k          0.52
out        0.48
to         0.46
r          0.45
n          0.44
Activations Density: 1.967%
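
The density figure above is commonly read as the fraction of sampled token positions on which the feature fires. The page does not state how it was computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the `threshold` parameter, and the toy activation values are all hypothetical and are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions with a
    strictly positive (above-threshold) feature activation.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 10 token positions, the feature fires on 2 of them.
toy_activations = [0.0, 0.0, 1.3, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0]
density = activation_density(toy_activations)
print(f"Activations Density: {density:.3%}")
```

Under this reading, a value of 1.967% would mean the feature is active on roughly 2 out of every 100 sampled tokens.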