Explanations
punctuation and sentence endings
Negative Logits
GOTREF    -0.90
SharedDtor    -0.83
AssemblyCompany    -0.82
שוליים    -0.82
ंदीखरीदारी    -0.81
IsMutable    -0.79
ftagPool    -0.75
noDo    -0.74
+#+#    -0.74
فريبيس    -0.73
Positive Logits
Furthermore    0.64
Because    0.62
Nonetheless    0.60
Furthermore    0.59
Because    0.59
Despite    0.56
Despite    0.55
Nonetheless    0.52
because    0.50
Nevertheless    0.50
Activations Density 0.032%