INDEX
Explanations
specific Twitter usernames
specific annotations or metadata related to concepts or entities
Negative Logits
emb      -0.80
Esper    -0.76
UX       -0.75
emb      -0.75
cham     -0.74
Chung    -0.74
ume      -0.72
Amit     -0.71
gaard    -0.68
achus    -0.68
Positive Logits
R        1.49
Rs       1.34
R        1.29
Ri       1.29
r        1.26
RP       1.24
RS       1.24
Rak      1.24
RI       1.20
RR       1.19
Activations Density 0.674%