Explanations
mentions of social media URLs
Negative Logits
corrections   -0.68
Levant        -0.67
assistants    -0.64
ÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤ  -0.62
scratches     -0.61
beginners     -0.61
distances     -0.60
repairs       -0.60
sanctions     -0.59
caring        -0.59
Positive Logits
zx            1.20
(not shown)   1.11
zn            1.06
Uk            1.02
OY            1.00
bh            0.99
fb            0.99
uo            0.98
xus           0.98
qv            0.97
Activations Density 2.108%