Explanations
references to literature and works of authors
Negative Logits
Token     Logit
racuse    -0.89
Philo     -0.78
Rufus     -0.78
dtd       -0.77
Sami      -0.77
Ruf       -0.73
Sami      -0.72
Ribera    -0.72
Yoon      -0.70
varez     -0.70
Positive Logits
Token     Logit
Dö        0.92
Georgie   0.92
axel      0.90
Aer       0.88
orses     0.88
waite     0.85
MSM       0.83
Vey       0.83
Fie       0.81
Berwick   0.81
Activations Density 2.405%