INDEX
Explanations
digits like 7 or 9
numbers, dosages, people
Negative Logits
t    0.82
(    0.80
не    0.66
եմ    0.66
ει    0.65
。    0.63
ीकृत    0.62
experimentado    0.61
і    0.61
𝐭    0.61
Positive Logits
for    1.05
in    1.00
↵    0.92
在    0.89
at    0.85
TO    0.82
Have    0.78
در    0.76
RE    0.73
في    0.73
Activations Density 0.651%