INDEX
Explanations
references to measures or scores in evaluations or rankings
Negative Logits
Token      Logit
лÑıв       -0.16
Wagner     -0.15
Cole       -0.15
ôle       -0.14
â̦         -0.14
wd         -0.13
Ole        -0.13
fasting    -0.13
213        -0.13
Gh         -0.13
Positive Logits
Token      Logit
Public     0.34
PUBLIC     0.31
Public     0.30
ken        0.28
public     0.26
/Public    0.25
.Public    0.25
_public    0.24
_PUBLIC    0.23
PUBLIC     0.23
Activations Density 0.001%