INDEX
Explanations
comparative phrases indicating contrasts or differences
Negative Logits
decently    -0.53
himself     -0.52
ftagPool    -0.50
himself     -0.50
Anyways     -0.49
Итак        -0.49
же          -0.49
cherchés    -0.49
Пока        -0.48
shall       -0.48
Positive Logits
Majefty     1.07
purpoſe     1.06
ſtate       1.06
Jefus       1.00
itſelf      0.99
uſed        0.97
uſe         0.96
pleaſure    0.96
reaſon      0.95
Efq         0.95
Activations Density: 2.888%