INDEX
Explanations
instances of the word "couple"
Negative Logits
Token         Logit
schild        -0.88
sein          -0.76
andise        -0.76
enegger       -0.68
obin          -0.66
eworld        -0.65
efe           -0.65
roup          -0.65
Directorate   -0.65
ivism         -0.64
Positive Logits
Token         Logit
dozen         1.38
hundred       1.30
thousand      1.07
dozen         1.00
weeks         0.85
of            0.82
ILCS          0.80
een           0.80
tablespoons   0.78
thirds        0.76
Activations Density: 0.014%