Explanations
entities knowing or wanting things
Negative Logits
তিনি  0.35
তিনিও  0.29
ergibt  0.29
Means  0.28
Brings  0.28
beliau  0.27
会导致  0.27
означает  0.27
導致  0.27
তিনি  0.27
Positive Logits
itself  0.47
knows  0.37
wants  0.35
believes  0.33
자체  0.33
recognizes  0.32
understands  0.32
knew  0.32
considers  0.31
thinks  0.31
Activations Density 0.139%