INDEX
Explanations
phrases that convey opinions or evaluative statements
Followed by a preposition or "also"
end of clause or phrase
Negative Logits
houſe     -0.88
purpoſe   -0.82
Jefus     -0.82
ſche      -0.82
ſtate     -0.81
ſelf      -0.80
becauſe   -0.79
pleaſure  -0.78
ſhe       -0.78
chofe     -0.78
Positive Logits
">*       0.61
"}        0.59
;">       0.59
")");     0.57
lihatkan  0.57
%;        0.55
`]        0.54
"}")      0.53
be        0.53
"),       0.52
Activations Density 0.245%