Explanations
reasons or justifications for certain actions or phenomena
questions and justifications for reasons
Negative Logits
ivalent       -0.75
isha          -0.73
ona           -0.73
urated        -0.72
iership       -0.72
ail           -0.71
lude          -0.68
favourites    -0.67
nails         -0.67
nik           -0.66
Positive Logits
scarcity         0.79
misunderstand    0.78
ecause           0.77
Because          0.76
Firstly          0.76
arcity           0.73
Because          0.71
incent           0.68
cause            0.67
antitrust        0.65
Activations Density: 0.270%