INDEX
Explanations
phrases related to decision-making and opinions
instances of the word "is"
Negative Logits
prominently   -0.63
efully        -0.63
ected         -0.61
▬▬            -0.58
ASED          -0.57
PDATE         -0.57
ADRA          -0.57
chnology      -0.54
ゴン          -0.54
exting        -0.53
Positive Logits
s        3.40
sb       1.67
sie      1.64
si       1.64
ski      1.56
ses      1.52
sa       1.51
sin      1.49
sburg    1.49
sg       1.48
Activations Density 0.420%
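
As a rough illustration of the density figure above, the sketch below assumes that "activations density" means the fraction of sampled tokens on which the feature's activation is above zero. The page does not state that definition, and the function name, threshold, and sample data here are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (number of active tokens) / (total tokens).
    This is not confirmed by the dashboard itself.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 tokens, 4 of which activate the feature.
sample = [0.0] * 996 + [1.2, 0.3, 2.5, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.400%
```

Under this reading, a density of 0.420% would correspond to the feature firing on roughly 4 of every 1000 tokens.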