Explanations
terms associated with habitual actions or expectations
like "also", "still", and "then"
also / then / still / only followed by a word
Negative Logits
Jefus       -0.87
Theſe       -0.87
Chriftian   -0.79
Efq         -0.79
becauſe     -0.76
pleaſure    -0.69
ftate       -0.68
Diſ         -0.67
purpoſe     -0.67
Eſ          -0.66
Positive Logits
also         1.07
not          1.03
likely       1.01
going        0.95
always       0.95
still        0.92
actually     0.88
able         0.87
just         0.85
considered   0.84
Activations Density 0.289%