INDEX
Explanations
auxiliary verbs and negations
Negative Logits
hooks     0.40
Esses     0.37
Hooks     0.35
Applies   0.34
pins      0.34
Assists   0.34
Assault   0.33
S         0.33
XIV       0.33
为主      0.33
Positive Logits
hacerlo   0.50
według    0.44
ঝুঁক       0.42
确实      0.41
殚        0.40
indeed    0.39
aurants   0.39
tijdens   0.39
pretože   0.38
farlo     0.38
Activations Density 0.012%