Explanations
references to the word "whom"
Negative Logits
[token lost]    -0.67
↵               -0.64
,               -0.59
(               -0.57
[token lost]    -0.57
Je              -0.57
Chi             -0.56
is              -0.55
ess             -0.55
los             -0.55
Positive Logits
ainfi             0.93
avoient           0.92
plufieurs         0.87
Monfieur          0.86
miniaturka        0.85
wikipagina        0.84
<=",              0.82
indígen           0.81
feroit            0.81
desmotivaciones   0.79
Activations Density: 0.402%