Explanations
phrases that express actions or intentions related to trying, playing, and using tools or methods
Negative Logits
Token          Logit
PostInfinity   -0.47
biais          -0.33
Rohy           -0.29
تاح            -0.27
جمعیت          -0.26
Ilustra        -0.26
(blank)        -0.26
Clippers       -0.26
chó            -0.26
cámara         -0.25
Positive Logits
Token            Logit
experimenting    1.81
experimentation  1.73
messing          1.68
experiment       1.67
Experiment       1.63
Experiment       1.63
tinkering        1.60
experiment       1.58
experimented     1.55
tinker           1.47
Activations Density 0.489%
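As a rough illustration of how a density figure like the one above could be computed, the sketch below assumes that "density" means the percentage of sampled token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as "active" when its activation is strictly
    greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 1,000 token positions, 5 of which activate the feature.
sample = [0.0] * 995 + [0.7, 1.2, 0.3, 2.4, 0.9]
density = activation_density(sample)
print(f"Activation density: {density:.3f}%")
```

On this reading, a density of 0.489% would mean the feature fires on roughly 1 in every 200 sampled tokens.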