Explanations
I express personal states and actions
Negative Logits
Token          Value
proactively    0.47
गेटिव           0.46
अगर            0.45
していきます     0.44
completamente  0.44
gets           0.43
နဲ့             0.43
synergy        0.43
거라고          0.43
basically      0.43
Positive Logits
Token          Value
shall          0.68
scarcely       0.66
dares          0.63
dare           0.61
tremble        0.59
speak          0.58
trembled       0.56
shan           0.56
assure         0.55
cannot         0.52
Activations Density: 0.020%