Explanations
negations and evaluative expressions
Negative Logits
Token      Logit
ÑģÑĤи     -0.16
528        -0.15
Boss       -0.14
ARS        -0.14
záp        -0.14
ulario     -0.13
drs        -0.13
eras       -0.13
ario       -0.13
ayed       -0.13
Positive Logits
Token         Logit
θμ            0.16
oya           0.15
Patterson     0.15
ugu           0.15
legg          0.14
captures      0.14
ugins         0.14
ownik         0.14
SetProperty   0.14
ëŀĢ           0.14
Activations Density: 0.000%