Explanations
references to soap
references to soap and its production
Negative Logits
issued        -0.68
etermined     -0.67
sen           -0.65
interrupted   -0.64
trl           -0.62
sworn         -0.61
ooming        -0.61
classified    -0.61
rals          -0.60
uding         -0.60
Positive Logits
soap          1.30
opera         1.08
oper          0.98
ãĤ¦ãĤ¹        0.94
glers         0.84
yarn          0.80
utical        0.79
Pengu         0.77
Bunny         0.75
combe         0.74
Activations Density 0.009%