INDEX
Explanations
comparisons of actions or levels of achievement
comparisons regarding the impact of actions or influences on various subjects
Negative Logits
Token         Logit
Users         -0.69
Sah           -0.68
icas          -0.64
held          -0.64
Fever         -0.63
Height        -0.62
urated        -0.62
Parameters    -0.62
resses        -0.62
icia          -0.61
Positive Logits
Token             Logit
damage            1.09
harm              1.08
homework          1.05
chores            1.04
groundwork        0.90
work              0.89
mischief          0.88
research          0.88
reconnaissance    0.87
outreach          0.85
Activations Density: 0.104%