INDEX
Explanations
relating entities to action or direction
Negative Logits
ov    0.53
自分が    0.50
自己的    0.50
read    0.48
Interchange    0.46
自分の    0.44
自己    0.44
h    0.44
om    0.43
op    0.43
Positive Logits
deewana    0.61
into    0.50
إلى    0.50
toward    0.49
unscathed    0.49
complacent    0.48
weary    0.47
へと    0.47
vào    0.46
irresist    0.46
Activations Density 0.088%