INDEX
Explanations
references to personal experiences or inquiries
Negative Logits
<<<<<<<<<<<<<<   -0.76
Initializes      -0.57
Carcinogenicity  -0.53
weil             -0.53
cív              -0.53
Gouver           -0.52
Kości            -0.51
lendir           -0.51
wholes           -0.51
trouverez        -0.50
Positive Logits
If               1.04
if               1.03
หาก              0.90
any              0.88
Jeśli            0.87
If               0.87
если             0.86
Если             0.86
eğer             0.86
Eğer             0.86
Activations Density 0.318%