INDEX
Explanations
instances of the word "two" in various contexts
Negative Logits
renheit    -0.87
Ô          -0.82
ugu        -0.80
schild     -0.79
za         -0.77
agher      -0.74
ategory    -0.74
ovi        -0.73
gallery    -0.72
nect       -0.72
Positive Logits
halves     1.51
sides      1.26
thirds     1.18
sexes      1.11
fold       1.04
Kore       1.01
parties    0.95
extremes   0.95
dozen      0.88
brothers   0.85
Activations Density: 0.041%