Explanations
occurrences of punctuation marks and sentence endings
Negative Logits
accept            -0.64
tolerate          -0.62
tol               -0.61
tolerated         -0.55
accepts           -0.53
accepter          -0.51
toler             -0.50
accepting         -0.49
Kor               -0.48
er                -0.48
Positive Logits
ьаж               0.93
ThroughAttribute  0.90
RegressionTest    0.83
houſe             0.80
itſelf            0.80
raiſ              0.79
doubtnut          0.78
disambiguazione   0.78
greateſt          0.77
photolibrary      0.77
Activations Density: 0.055%