Explanations
expressions of understanding or awareness
Negative Logits
dAtA              -0.80
avice             -0.65
PositiveButton    -0.63
SharedDtor        -0.61
nevoie            -0.58
ویکیپدیای         -0.57
DebuggerNonUser   -0.57
centralised       -0.56
EndInit           -0.56
mín               -0.56
Positive Logits
know              0.80
ho                0.67
Know              0.63
itrile            0.63
Sabes             0.63
know              0.62
Cosby             0.61
مشين              0.61
KNOW              0.59
ఔ                 0.58
Activations Density: 0.050%