Explanations
phrases indicating expectations or future actions
Negative Logits
^(@)            -0.92
Efq             -0.86
adaptiveStyles  -0.83
Jefus           -0.83
cdti            -0.79
Houſe           -0.77
SBATCH          -0.77
للمعارف         -0.76
InitVars        -0.75
$_"             -0.74
Positive Logits
according  1.62
according  1.53
According  1.51
According  1.47
Según      1.42
selon      1.35
Selon      1.34
Secondo    1.32
Según      1.30
Selon      1.30
Activations Density: 0.112%