INDEX
Explanations
themes related to historical context and social commentary
Negative Logits
hopefully    -0.08
ince         -0.08
             -0.07
onium        -0.07
_almost      -0.07
iamond       -0.06
/loader      -0.06
ame          -0.06
prostitu     -0.06
antis        -0.06
Positive Logits
however      0.41
however      0.28
However      0.27
However      0.25
jedoch       0.24
HOWEVER      0.23
однако       0.23
však         0.22
aber         0.20
allerdings   0.19
Activations Density 0.800%