INDEX
Explanations
email addresses and website URLs
email addresses and social media handles
Negative Logits
Reduction    -1.03
Tenth        -1.01
Disorder     -1.00
Ninth        -0.98
Lines        -0.97
Fin          -0.94
Direction    -0.93
Index        -0.93
Finder       -0.93
Hole         -0.92
Positive Logits
olson        1.07
americ       1.05
izabeth      0.92
_            0.92
football     0.91
brown        0.91
pod          0.89
@            0.86
clinton      0.85
ensis        0.84
Activations Density: 0.174%