INDEX
Explanations
phrases related to directional changes or transformations
Negative Logits
tableau          -0.47
prio             -0.43
estu             -0.42
UrlResolution    -0.42
merk             -0.41
financiers       -0.41
Molto            -0.41
luence           -0.41
verz             -0.40
exploded         -0.40

Positive Logits
turn              0.93
Turn              0.89
turn              0.89
turns             0.84
Turn              0.82
TURN              0.82
TURN              0.79
turns             0.79
Turns             0.68
Turned            0.68
Activations Density 0.397%