INDEX
Explanations
words that indicate causality or implications
Negative Logits
SBATCH           -0.66
رشف              -0.66
Cæsar            -0.66
sight            -0.66
Bela             -0.65
agrarian         -0.64
Gher             -0.63
utafitiHapana    -0.63
cactus           -0.61
Shakspeare       -0.61
Positive Logits
所以在            0.65
demek            0.60
many             0.59
So               0.59
HasForeignKey    0.58
oporosis         0.58
بنابراین         0.58
といって          0.58
so               0.57
HideInInspector  0.57
Activations Density: 0.318%