Explanations
base level or as a foundation
Negative Logits
Token      Value
ین         0.78
ар         0.72
ю          0.70
уви        0.68
zacz       0.67
لي         0.66
я          0.66
пробу      0.63
ید         0.63
испыта     0.63
Positive Logits
Token      Value
Base       1.44
base       1.29
Base       1.27
base       1.18
for        1.05
BASE       1.05
BASE       0.90
बेस        0.90
Bases      0.90
ฐาน        0.87
Activations Density: 0.042%