INDEX
Explanations
instances of text containing specific symbols or characters
repeated special characters or symbols
Negative Logits
raints          -0.76
hare            -0.73
matic           -0.67
similarities    -0.66
gist            -0.63
orial           -0.63
primates        -0.63
tone            -0.62
partnerships    -0.61
suit            -0.60
Positive Logits
âĶĢâĶĢ          1.17
ľ               1.11
ĸ               0.93
ļ               0.93
ª               0.93
âĶĢâĶĢâĶĢâĶĢ    0.92
Ĺ               0.90
¼               0.90
¸               0.89
Ĭ               0.86
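
The positive-logit tokens above look like byte-level BPE token strings rather than ordinary text. Assuming they use the GPT-2 style byte-to-character table (an assumption; the page does not name the tokenizer), each displayed character stands for exactly one raw byte, and the following sketch maps a token string back to its bytes. The helper names bytes_to_unicode, CHAR_TO_BYTE, token_to_bytes and describe are illustrative only and are not taken from the page.

```python
def bytes_to_unicode():
    """Rebuild the byte -> display-character table used by GPT-2 style byte-level BPE.

    Assumption: the dashboard tokens use this mapping. Printable bytes map to
    themselves; every other byte is shifted to code point 256 and upward.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: display character -> raw byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Convert a byte-level token string back to the raw bytes it stands for."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


def describe(token: str):
    """Return (hex string, best-effort UTF-8 text) for a token string."""
    raw = token_to_bytes(token)
    # Lone continuation bytes are not valid UTF-8 on their own, so replace them.
    return raw.hex(), raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["\u00e2\u0136\u0122\u00e2\u0136\u0122", "\u013e", "\u00aa"]:
        print(repr(tok), describe(tok))
```

Under that assumption, the top token decodes to the bytes e2 94 80 e2 94 80, which is two U+2500 box-drawing horizontal characters, while single-character entries such as ľ (byte 9c) or ª (byte aa) are lone UTF-8 continuation bytes, meaning fragments of multi-byte symbols. That reading is consistent with the explanations listed above (repeated special characters or symbols).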
Activations Density 0.256%