INDEX
Explanations
important decision or action
Negative Logits
Token        Value
describe     0.82
Describe     0.80
Describe     0.78
惀           0.77
پورا         0.76
englisch     0.75
problemler   0.72
describir    0.71
describe     0.70
aru          0.70
Positive Logits
Token        Value
move         2.86
moves        2.65
move         2.32
Move         2.19
moves        2.18
gesture      2.13
Move         2.11
steps        2.11
পদক্ষেপ      2.03
actions      2.00
Activations Density: 0.462%