INDEX
Explanations
references to legal claims and accusations
Negative Logits
olta       -0.14
Forecast   -0.13
oka        -0.13
ynes       -0.13
esser      -0.13
ानत        -0.12
елÑİ       -0.12
Mezi       -0.12
otas       -0.12
ç·ł        -0.12
POSITIVE LOGITS
deny       0.42
denies     0.41
denied     0.40
denying    0.37
denial     0.37
deny       0.31
DEN        0.31
vehement   0.31
defended   0.30
refute     0.30
Activations Density 0.199%