INDEX
Explanations
variations of the word "be" and forms of the verb "to be" across different contexts
Negative Logits
Token       Logit
ten         -0.21
eleven      -0.17
tenth       -0.17
thousands   -0.17
eighth      -0.16
Hundreds    -0.15
inth        -0.15
millions    -0.15
ninth       -0.15
twelve      -0.15
Positive Logits
Token       Logit
50          0.69
60          0.69
65          0.67
55          0.65
70          0.64
52          0.63
62          0.62
56          0.62
54          0.61
40          0.61
Activations Density 0.450%