Explanations
phrases indicating compensatory actions or attributes in contexts involving performance and education
Negative Logits
iba     -0.15
anas    -0.14
bulan   -0.14
ī       -0.14
ima     -0.14
apt     -0.14
981     -0.14
еж      -0.13
IMA     -0.13
_NC     -0.13
Positive Logits
lost        0.40
Lost        0.33
Lost        0.32
losses      0.32
lost        0.32
compensate  0.28
_lost       0.28
loss        0.28
loss        0.26
Loss        0.24
Activations Density: 0.020%