INDEX
Explanations
phrases indicating conditional situations or qualifications
Negative Logits
odium     -0.18
uhn       -0.15
bound     -0.15
ontrol    -0.14
ZE        -0.14
abinet    -0.14
rame      -0.14
Δε        -0.14
Bound     -0.14
_bound    -0.14
Positive Logits
RelativeTo    0.16
exclus        0.16
căn           0.16
-o            0.15
xBC           0.15
menn          0.15
apas          0.14
nghiệp        0.14
ца            0.14
hosp          0.14
Activations Density 0.610%