Explanations
adjectives describing something as beneficial, practical, or advantageous
Negative Logits
otta       -0.65
boarded    -0.65
uph        -0.64
Alive      -0.62
agate      -0.61
buck       -0.61
Bom        -0.60
Pall       -0.59
Kard       -0.59
olition    -0.58
Positive Logits
idiots     1.16
adjunct    0.83
insights   0.81
tools      0.80
tip        0.80
insight    0.78
fully      0.78
tool       0.76
tips       0.75
iences     0.72
Activations Density 0.048%