Explanations
specific special characters and phrases that include them
references or symbols related to a specific individual or character in a repeated manner
Negative Logits
anwhile       -0.72
multiplying   -0.69
Sylvia        -0.67
assassinate   -0.66
specificity   -0.65
fulfillment   -0.64
redients      -0.64
travel        -0.63
confinement   -0.62
blacklist     -0.62
Positive Logits
¬   1.07
º   1.05
į   1.04
ı   1.03
¹   1.00
ł   0.98
Ĵ   0.98
Ń   0.96
§   0.95
Ľ   0.93
Activations Density: 0.061%