Explanations
describing or framing something
Negative Logits
แล้ว  0.52
όπως  0.48
그리고  0.47
ហើយ  0.47
säger  0.47
Dazu  0.46
ይቀ  0.46
然后  0.45
ထ  0.45
وتح  0.45
Positive Logits
being  1.02
having  1.00
being  0.98
étant  0.95
having  0.87
olev  0.75
needing  0.73
étant  0.71
requiring  0.67
Being  0.66
Activations Density 0.061%