Explanations
the names of individuals, specifically those mentioned frequently in a conversational or narrative context
Negative Logits
Token            Logit
'},              -0.86
'){              -0.85
"){              -0.82
/**              -0.79
Hok              -0.77
Steiner          -0.76
=$((             -0.75
UnsafeEnabled    -0.73
Datuak           -0.73
Segal            -0.71
Positive Logits
Token            Logit
Dave             0.89
Dave             0.88
gebob            0.88
Bob              0.83
Bob              0.81
dave             0.73
Joe              0.72
Sue              0.72
Joe              0.71
Jim              0.68
Activations Density: 0.049%