Explanations
questions and statements referring to existence or presence of certain aspects or conditions
Negative Logits
©¶æ¥µ       -0.78
ruciating   -0.72
abor        -0.72
CTV         -0.71
yssey       -0.70
yk          -0.68
Ãį          -0.66
ulner       -0.66
inery       -0.65
ieves       -0.65
Positive Logits
?:          1.30
?'          1.22
?           1.21
...?        1.18
?'"         1.13
?)          1.13
?"          1.12
?).         1.06
?",         1.04
?),         1.03
Activations Density 0.188%