INDEX
Explanations
phrases or sentences containing the word "misleading."
terms associated with misinformation and misleading information
NEGATIVE LOGITS
token          logit
dain           -0.70
DAY            -0.67
mun            -0.67
itar           -0.66
empl           -0.65
inct           -0.64
El             -0.63
%"             -0.63
Goal           -0.62
examination    -0.62
POSITIVE LOGITS
token          logit
misled         1.33
mislead        1.31
misleading     1.25
deceive        1.14
deceived       1.05
dece           0.99
ingly          0.95
confuse        0.95
misrepresent   0.90
falsely        0.84
Activations Density: 0.018%