Explanations
references to fake news or misinformation
Negative Logits
ब्रेकडाउन          -0.40
postIndex         -0.37
MessageTagHelper  -0.36
gyhoeddwyd        -0.34
identidad         -0.34
balleur           -0.34
ấp                -0.33
wikipagina        -0.32
identité          -0.32
pegno             -0.31
Positive Logits
exaggerate        0.49
exaggerated       0.49
exagger           0.48
exaggeration      0.48
exaggerating      0.48
inflated          0.45
exager            0.44
المعيارى          0.44
Pad               0.43
faulty            0.43
Activations Density 0.897%
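
The page does not say how the density figure is computed. A minimal sketch, assuming the common definition (the fraction of token positions on which the feature's activation is above zero), might look like the following. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The dashboard may use a different definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 1000 token positions, the feature fires on 9 of them.
toy = [0.0] * 991 + [1.5] * 9
print(f"{activation_density(toy):.3%}")
```

Under this reading, a density of 0.897% would mean the feature is active on roughly 9 of every 1000 sampled tokens.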