Explanations
references to deliberate actions and their consequences, particularly in the context of violence or injury
Negative Logits
ibil         -0.18
iqu          -0.15
ique         -0.15
accompany    -0.14
posure       -0.14
otland       -0.14
orthand      -0.14
éĿ©          -0.14
firefight    -0.14
blink        -0.14
Positive Logits
hit          0.39
ram          0.32
hit          0.31
hits         0.30
Hit          0.30
Hit          0.28
HIT          0.28
æĴ           0.27
hitting      0.27
-hit         0.26
Activations Density 0.076%
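
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical; the data is constructed only so that the result matches the 0.076% shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 25,000 token positions, the feature fires on 19 of them.
acts = [0.0] * 25_000
for i in range(19):
    acts[i * 1_000] = 1.5  # arbitrary positive activation value

density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.076% means the feature is active on roughly 1 in every 1,300 tokens, which is consistent with a sparse feature tied to a narrow set of tokens such as the "hit" variants listed above.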