Explanations
expressions of frustration and injustice related to socioeconomic conditions
Negative Logits
Token            Logit
ModelExpression  -0.90
jsonwebtoken     -0.80
surla            -0.80
nakalista        -0.71
дописавши        -0.70
abestanden       -0.67
)?;              -0.66
MLLoader         -0.66
transfieras      -0.65
ویکیپدیا         -0.64
Positive Logits
Token            Logit
SearchView       0.45
forced           0.44
Worse            0.42
alors            0.41
Worse            0.40
приходится       0.40
combatt          0.39
jenost           0.39
Constant         0.39
Harp             0.38
Activations Density 0.376%
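
The density figure above is usually read as the fraction of tokens on which the feature fires at all. The sketch below shows that calculation under this assumption; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and do not come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 3 of 1000 tokens activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under the same reading, a density of 0.376% would correspond to roughly 4 active tokens per 1,000 sampled.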