Explanations
unusual characters or symbols from the text
instances of a specific repeated character or symbol
Negative Logits
Token        Logit
Tanz         -0.75
captives     -0.69
dissidents   -0.65
Gaul         -0.64
Canadians    -0.61
vulner       -0.59
Palestin     -0.59
Lowell       -0.58
Stuff        -0.58
Mobil        -0.58
Positive Logits
Token        Logit
ï¸ı          0.85
ï¸           0.81
own          0.80
Pg           0.75
uable        0.75
should       0.73
ished        0.73
forth        0.71
ved          0.71
imgur        0.70
Activations Density: 0.177%