INDEX
Explanations
contact information and web links
Negative Logits
Jungle        -0.90
Moonlight     -0.75
revenge       -0.75
selfies       -0.69
Sour          -0.69
vengeance     -0.66
Diesel        -0.66
Cinderella    -0.66
shaving       -0.64
Ghost         -0.63
Positive Logits
gov           1.41
edu           1.31
ucl           1.14
org           1.05
uckland       0.95
ournals       0.93
central       0.92
western       0.91
debian        0.87
microsoft     0.87
Activations Density: 0.046%