INDEX
Explanations
references to surveys and research methodologies
Negative Logits
usalem  -0.15
aura    -0.15
ruba    -0.14
tera    -0.14
ingly   -0.14
anni    -0.14
Kou     -0.14
.micro  -0.14
pone    -0.14
unist   -0.13
Positive Logits
SAMPLE   0.16
SAMPLE   0.15
ìĥ       0.15
adle     0.14
samples  0.14
Gamer    0.14
ATUS     0.14
edu      0.13
ìĥ       0.13
248      0.13
Activations Density 0.020%