Explanations
mentions of fake news and media literacy
Negative Logits
Predecesor    -0.40
HOR           -0.35
außer         -0.34
pê            -0.33
delegate      -0.32
Aholisi       -0.32
başladı       -0.32
HOR           -0.32
길            -0.31
デューサー    -0.31
Positive Logits
disinformation    0.62
misinformation    0.60
########.         0.53
twimg             0.53
HasFactory        0.52
ligiloj           0.52
IsContent         0.51
RenderAtEndOf     0.49
cupertino         0.49
برانيه            0.48
Activations Density 0.016%