Explanations
social media or forum profile information
instances of user account or profile information
Negative Logits
Token        Logit
andro        -0.85
othy         -0.81
ouf          -0.76
agos         -0.72
ider         -0.70
atically     -0.69
utenberg     -0.68
othe         -0.67
streets      -0.66
iren         -0.66
Positive Logits
Token        Logit
Joined       1.57
Joined       0.98
Rankings     0.83
76561        0.82
▬            0.79
ertodd       0.78
▬▬           0.77
pedia        0.75
uates        0.71
Offline      0.70
Activations Density: 0.005%
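
Some tokens in the logit lists are not plain ASCII. Byte-level BPE vocabularies typically store the character ▬ (U+25AC) as the three-character string âĸ¬, one stand-in character per UTF-8 byte. The sketch below shows how such a string can be mapped back to readable text. It assumes the tokens come from a GPT-2 style byte-level vocabulary, which the dashboard itself does not state; the function names bytes_to_unicode and decode_token are illustrative, not part of any tool shown here.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table: byte value -> printable stand-in character.

    Assumption: the vocabulary uses the standard GPT-2 byte-level mapping.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    chars = printable[:]
    offset = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are shifted to code points 256 and above.
            printable.append(b)
            chars.append(256 + offset)
            offset += 1
    return {b: chr(c) for b, c in zip(printable, chars)}


# Inverse table: stand-in character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a vocabulary string back into readable text.

    Raises KeyError if the string contains a character outside the table.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    # A single token may hold only part of a multi-byte character,
    # so undecodable bytes are replaced rather than raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("\u00e2\u0138\u00ac"))  # ▬
    print(decode_token("Joined"))              # Joined
```

Plain ASCII tokens such as Joined pass through unchanged, so the same helper can be applied to every row of both logit lists.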