Explanations
taking responsibility and action
Negative Logits
Token           Value
ச்சூழ            0.55
3               0.54
úncia           0.53
URCH            0.51
لوبو            0.50
pré             0.49
gebe            0.48
সম্ভাব্য          0.48
செய்யப்பட்ட       0.47
വ്യാപ            0.47
Positive Logits
Token           Value
took            0.85
advantage       0.83
take            0.80
taken           0.79
Take            0.71
taking          0.71
TAKE            0.69
a               0.67
Taking          0.67
takers          0.66
Activations Density 0.065%
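
The density figure is presumably the share of sampled token positions on which the feature fires. A minimal sketch of that calculation, assuming "fires" means an activation above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard:

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as active when its activation is strictly
    greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data for illustration only: 13 active positions out of 20,000.
activations = [1.5] * 13 + [0.0] * 19_987
density = activation_density(activations)
print(f"{density:.3%}")
```

A density this low would mean the feature is sparse: it is silent on the vast majority of tokens and fires only in a narrow set of contexts.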