INDEX
Explanations
instances of the word "backwards" in a text
the concept of "backwards" or actions performed in reverse
NEGATIVE LOGITS
raltar      -0.74
"},"        -0.72
AIN         -0.70
illary      -0.70
riz         -0.68
ULE         -0.65
ittee       -0.64
========    -0.63
tein        -0.63
te          -0.61
POSITIVE LOGITS
stairs      0.93
backward    0.90
actively    0.90
gradation   0.89
ward        0.86
Ħ¢          0.84
wards       0.84
dash        0.83
compat      0.79
backwards   0.77
Activations Density 0.006%