INDEX
Explanations
focus on operation and perception details
Negative Logits
😟          0.50
whitish     0.50
brownish    0.49
almond      0.47
yellowish   0.46
(blank)     0.46
"..         0.45
😃          0.45
ช่วย        0.44
ueness      0.44
Positive Logits
praxis          0.54
bastard         0.52
抵达            0.51
tether          0.48
asymptotically  0.48
fetish          0.48
prophyl         0.48
fundamentally   0.48
obsol           0.48
desperately     0.47
Activations Density 0.017%