INDEX
Explanations
instances of strong opinions and rhetorical questioning in a debate or discussion context
Negative Logits
روابط  -0.72
னர்  -0.66
letar  -0.52
Portail  -0.52
Kirche  -0.51
respectivamente  -0.51
لينا  -0.51
AccessorTable  -0.50
Statistiche  -0.50
urman  -0.49
Positive Logits
it  0.81
they  0.66
AndEndTag  0.61
それは  0.55
Wikispecies  0.55
its  0.50
surely  0.50
оно  0.49
then  0.47
ujednoznacz  0.47
Activations Density: 0.375%