Explanations
expressions of agreement and disagreement
Negative Logits
Token       Logit
Datuak      -0.59
presence    -0.57
忱          -0.55
->$         -0.55
shot        -0.52
Van         -0.50
forName     -0.50
abancı      -0.50
بواسطة      -0.49
Sol         -0.49
Positive Logits
Token       Logit
agreed      3.91
agree       3.87
agrees      3.54
Agree       3.48
agreeing    3.44
agreed      3.43
Agreed      3.40
agree       3.28
Agree       3.02
Agreed      2.87
Activations Density 0.084%