Explanations
expressions of confirmation or statements made during interviews
Negative Logits
Token     Logit
.mas      -0.17
itoris    -0.16
imizer    -0.15
ween      -0.15
enaire    -0.15
links     -0.14
ippi      -0.14
etur      -0.14
plorer    -0.14
ÏģαÏĤ      -0.14
Positive Logits
Token      Logit
outers     0.15
Guerr      0.14
CTS        0.14
ÑĢеж       0.14
hazi       0.13
Scheme     0.13
693        0.13
mediation  0.13
auses      0.13
opr        0.13
Activations Density 0.069%