Explanations
phrases related to objectives or intentions
Negative Logits
Token          Logit
Franke         -0.76
ogeneous       -0.73
Peterson       -0.66
"}}            -0.65
Schroeder      -0.65
readFileSync   -0.63
McCarty        -0.62
ibouti         -0.62
The            -0.61
Platten        -0.60
Positive Logits
Token   Logit
Aims    1.63
Aim     1.58
AIM     1.53
Aim     1.47
Aims    1.45
AIM     1.38
aim     1.33
aims    1.33
aim     1.25
aimed   1.24
Activations Density: 0.049%
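
The density figure can be read as the share of sampled tokens on which the feature fires. The sketch below illustrates that reading only; it assumes "fires" means an activation above zero, and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical rather than taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as active when its activation is strictly
    greater than the threshold (zero by default).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 49 active tokens out of 100,000.
sample = [1.2] * 49 + [0.0] * 99_951
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.049%
```

Under this reading, a density of 0.049% corresponds to roughly 49 active tokens per 100,000 sampled.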