INDEX
Explanations
proper nouns, specifically names of people and places
occurrences of the word "ass" and its variations
Negative Logits
Token        Logit
icum         -0.80
istically    -0.78
spring       -0.71
ODUCT        -0.70
amination    -0.69
ization      -0.69
icient       -0.68
ãĤ¨          -0.68
icial        -0.66
opez         -0.66
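The entry `ãĤ¨` in the list above looks like encoding debris, but it is plausibly a token from a byte-level BPE vocabulary shown in its raw vocabulary form. The sketch below assumes a GPT-2-style byte-to-character table (the page itself does not name the tokenizer) and maps such a string back to readable text. The function names `bytes_to_unicode` and `decode_token` are illustrative, not taken from the page.

```python
def bytes_to_unicode():
    """Build the byte -> printable-character table used by GPT-2-style byte-level BPE.

    Assumption: the dashboard's tokenizer follows this convention.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are remapped to code points starting at 256.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return dict(zip(byte_values, (chr(c) for c in code_points)))


# Inverse table: vocabulary character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a raw vocabulary string back into the text it represents."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # A single token may hold only part of a multi-byte character,
    # so invalid sequences are replaced rather than raising.
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĤ¨"))  # prints "エ" (katakana E, U+30A8)
```

Under that assumption the token corresponds to the bytes E3 82 A8, which is the UTF-8 encoding of the katakana character エ. Plain ASCII tokens such as `icum` decode to themselves.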
Positive Logits
Token        Logit
uve          0.85
erness       0.83
ment         0.82
ments        0.80
asse         0.76
xual         0.74
MENTS        0.73
daq          0.73
alam         0.72
mble         0.71
Activations Density 0.027%
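The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below illustrates that definition with synthetic data; `activation_density` and its `threshold` parameter are hypothetical names, and the sample values are made up for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example: the feature fires on 27 of 100,000 tokens.
sample = [0.0] * 99_973 + [1.0] * 27
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.027%"
```

Read this way, a density of 0.027% means the feature is active on roughly 27 out of every 100,000 tokens, which marks it as a sparse feature.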