INDEX
Explanations
email addresses
email addresses and domain names
Negative Logits
ancest      -0.84
gling       -0.64
tein        -0.63
Phelps      -0.62
OTT         -0.62
LOS         -0.62
Franch      -0.62
fitting     -0.61
fully       -0.61
IAL         -0.61
Positive Logits
gmail       1.08
(blank)     0.88
yahoo       0.87
bugs        0.75
ileaks      0.75
inx         0.75
username    0.74
zzle        0.74
agna        0.71
messenger   0.70
Activations Density 0.013%