Negative Logits
README      -0.07
undred      -0.07
olders      -0.07
onestly     -0.06
καθ         -0.06
playful     -0.06
sexual      -0.06
ollywood    -0.06
/non        -0.06
party       -0.06
Positive Logits
H           0.11
H           0.11
Harrison    0.10
haul        0.10
.H          0.10
HM          0.09
HB          0.09
-h          0.09
Ho          0.09
Hale        0.09
Activations Density 0.491%