Explanations
explaining relationships or functions
Negative Logits
또는        0.42
または      0.39
(missing)   0.37
ظام         0.37
ול          0.36
hoặc        0.35
kunt        0.35
nebo        0.34
หรือ        0.34
или         0.34
Positive Logits
itself        0.78
excellently   0.73
perfectly     0.72
nicely        0.72
wonderfully   0.69
отлично       0.69
awfully       0.68
beautifully   0.64
很大          0.62
很多人        0.60
Activations Density: 0.428%