INDEX
Explanations
strong affirmation or significant change
Negative Logits
R       0.57
Р       0.56
М       0.56
К       0.56
Ф       0.55
К       0.53
}}$     0.52
Х       0.51
)}      0.51
М       0.51
Positive Logits
absolutamente   0.90
unquestionably  0.84
fundamentally   0.83
unequivocally   0.79
ANYTHING        0.78
utterly         0.76
downright       0.75
praticamente    0.75
manifestly      0.74
proprement      0.73
Activations Density 4.798%