Explanations
references to prestigious educational institutions and their connections
Negative Logits
Wheatley     -0.91
Hayward      -0.75
Coates       -0.75
libri        -0.71
vell         -0.71
Dent         -0.70
stuhl        -0.69
GNA          -0.68
Galla        -0.67
Rheinland    -0.66
Positive Logits
Normdatei    0.83
Danilo       0.74
Connecticut  0.74
Knicks       0.72
Connecticut  0.70
Lama         0.69
Hano         0.68
дописавши    0.67
Childs       0.67
dtos         0.67
Activations Density: 1.687%