Explanations
phrases indicating uncertainty or complexity in discussions, particularly around societal issues and diverse perspectives
Negative Logits
[blank token]    -0.54
مرئيه    -0.53
ikistan    -0.52
Hochspringen    -0.52
Савезне    -0.51
szóci    -0.51
BoxFit    -0.50
ruptedException    -0.50
➯    -0.49
RegressionTest    -0.49
Positive Logits
-    0.44
exitRule    0.43
iprot    0.40
Karlsson    0.40
[    0.39
:    0.39
surla    0.38
correct    0.37
országban    0.36
VERY    0.35
Activations Density 1.343%