Explanations
phrases indicating the presence of a strong subject or character involved in actions or discussions
Negative Logits
Token       Logit
strap       -0.17
.sap        -0.16
@update     -0.15
argas       -0.14
ifer        -0.14
amarin      -0.14
ilha        -0.14
@show       -0.14
fark        -0.13
Academ      -0.13
Positive Logits
Token       Logit
anner       0.15
iglia       0.14
Hann        0.14
↵           0.14
esis        0.14
cr          0.13
ergy        0.13
Bord        0.13
_defaults   0.13
ANTI        0.13
Activations Density: 0.107%