Explanations
social media hashtags or tagging indicators, such as "@" and hashtags
instances of a specific character or string representation in the text
Negative Logits
Token        Logit
Silk         -0.72
Whis         -0.71
Vaugh        -0.69
Nam          -0.68
Morg         -0.66
Mayo         -0.65
shroud       -0.65
spinning     -0.65
Peb          -0.64
Carbuncle    -0.64
Positive Logits
Token        Logit
º            1.15
¹            1.12
į            1.01
§            1.00
£            1.00
»            0.98
¬            0.97
®            0.95
¿            0.95
¡            0.95
Activations Density: 0.116%