Explanations
significant claims or statements that seem improbable or extraordinary
Negative Logits
unspecified    -0.16
eniable        -0.15
ποτε           -0.15
ynn            -0.15
ektor          -0.15
эффек          -0.14
enderror       -0.14
undecided      -0.14
icontrol       -0.14
inde           -0.14
Positive Logits
Impossible     0.22
unlikely       0.22
impossible     0.20
odds           0.20
unlikely       0.20
Impossible     0.20
improbable     0.19
impro          0.18
imposs         0.18
aud            0.18
Activations Density 0.176%
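
The listings above could be produced along the following lines. This is only a minimal sketch, not the dashboard's actual method: the helper names `top_logit_effects` and `activation_density` are hypothetical, the sample pairs are a handful of values copied from the tables, and the sketch assumes that "density" means the fraction of sampled tokens on which the feature has a nonzero activation.

```python
from typing import List, Tuple

Pair = Tuple[str, float]  # (token string, logit effect)


def top_logit_effects(effects: List[Pair], k: int = 10) -> Tuple[List[Pair], List[Pair]]:
    """Return the k most negative and k most positive (token, value) pairs.

    Pairs are kept as a list rather than a dict because the same token
    string can appear more than once (e.g. "unlikely" in the table above).
    """
    ranked = sorted(effects, key=lambda pair: pair[1])  # ascending by value
    most_negative = ranked[:k]
    most_positive = ranked[::-1][:k]  # descending by value
    return most_negative, most_positive


def activation_density(activations: List[float]) -> float:
    """Fraction of sampled tokens with a strictly positive activation.

    Assumption: this is what the density figure measures.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > 0.0)
    return fired / len(activations)


# A few pairs taken from the tables above, for illustration only.
sample = [
    ("unspecified", -0.16),
    ("eniable", -0.15),
    ("improbable", 0.19),
    ("odds", 0.20),
    ("aud", 0.18),
]
negative, positive = top_logit_effects(sample, k=2)
```

Under that assumption, a density of 0.176% would correspond to the feature firing on roughly 1.76 out of every 1,000 sampled tokens.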