INDEX
Explanations
foreign characters, potentially indicating specific coding or language-related patterns
symbols and punctuation in the context of text formatting or coding
Negative Logits
Token      Logit
Bacon      -0.96
Bag        -0.94
XI         -0.94
Pes        -0.93
Eps        -0.93
Grill      -0.92
Union      -0.92
Opinion    -0.92
Meter      -0.92
Xiao       -0.91
Positive Logits
Token      Logit
subject    1.56
nob        1.54
future     1.53
general    1.49
same       1.47
average    1.47
new        1.46
simple     1.46
license    1.46
super      1.46
Activations Density: 0.628%