INDEX
Explanations
phrases with the pattern "âĢĻ" followed by a number as a token
instances of a particular character or symbol
Negative Logits
imitation     -0.71
carbohyd      -0.67
arios         -0.65
raviolet      -0.64
ramid         -0.63
pyramid       -0.63
iage          -0.62
wana          -0.62
convenience   -0.62
XT            -0.61
Positive Logits
女     1.03
Ļ     0.96
ï¸ı   0.92
İ     0.87
Ùħ    0.86
Ľ     0.85
ı     0.84
Ķ     0.82
ļ     0.82
ð     0.81
Activations Density 0.488%