INDEX
Explanations
mentions of specific organizations or titles related to social media
Negative Logits
PageRoute   -0.17
afone       -0.16
abi         -0.15
üz          -0.14
ardi        -0.14
irth        -0.14
ynom        -0.14
ionage      -0.14
inski       -0.13
adelphia    -0.13
Positive Logits
loss        0.19
lost        0.17
Loss        0.17
loss        0.17
-loss       0.16
outside     0.16
Outside     0.16
LOSS        0.15
Loss        0.15
losses      0.15
Activations Density: 0.006%