Explanations
quantitative metrics and statistical data
Negative Logits
ÙħÙĪÙĦ    -0.17
essel     -0.16
orna      -0.16
§         -0.15
aph       -0.15
hani      -0.15
ifiers    -0.15
íĻ©       -0.14
ischer    -0.14
íĻĶ       -0.14
Positive Logits
L         0.16
Ping      0.14
alls      0.14
еÑĢе      0.14
eres      0.14
ldb       0.14
Alto      0.14
üle       0.14
Medic     0.14
Sandwich  0.14
Activations Density: 0.035%