Explanations
references to indoctrination and manipulation of beliefs
Negative Logits
AutoField       -0.51
Meksiku         -0.49
rungsseite      -0.46
Билгалдахарш    -0.46
ब्रेकडाउन         -0.46
insics          -0.46
offsetof        -0.44
modelBuilder    -0.44
balleur         -0.43
Controllo       -0.43
Positive Logits
impression         0.53
propaganda         0.50
impressions        0.49
impression         0.48
indoctr            0.48
insegna            0.47
Propaganda         0.47
vermittelt         0.47
messaggio          0.46
InputDecoration    0.46
Activations Density 0.310%