INDEX
Explanations
phrases expressing the futility or ineffectiveness of actions
doing and its effects
Negative Logits
ſſung          -0.81
незавершена    -0.81
unſer          -0.79
<unused17>     -0.77
dieſer         -0.77
<unused41>     -0.76
<unused74>     -0.76
<unused20>     -0.76
[@BOS@]        -0.76
<unused14>     -0.76
Positive Logits
effects        0.40
effect         0.38
effets         0.36
effect         0.35
impact         0.35
करे            0.32
effet          0.31
does           0.31
effective      0.31
power          0.30
Activations Density 0.046%