Explanations
phrases indicating manipulation, subversion, and the triggering of unrest or dissent
Negative Logits
Tikang              -0.76
ModelExpression     -0.62
httphttps           -0.49
HomeAsUpEnabled     -0.48
AttributeSet        -0.48
WriteTagHelper      -0.47
انيف                -0.46
/**                 -0.45
Exacts              -0.45
KURZBESCHREIBUNG    -0.45
Positive Logits
estratégico         0.49
ThemeOverlay        0.46
tactic              0.45
psicológica         0.45
わざ                0.41
ligiloj             0.41
あえて              0.40
PAD                 0.39
psicológico         0.39
purposely           0.38
Activations Density: 0.829%