Explanations
foreign characters from specific languages, such as Serbian and Italian
exclamation marks and special characters
Negative Logits
ĸļ           -0.78
selage       -0.77
ortmund      -0.77
unciation    -0.77
sonian       -0.76
atis         -0.75
abase        -0.74
Gutenberg    -0.74
orno         -0.73
hof          -0.73
Positive Logits
dating       1.04
ï¸ı          0.92
coming       0.83
ban          0.82
ward         0.79
LOAD         0.78
stairs       0.77
dates        0.75
lishes       0.74
mit          0.72
Activations Density: 0.007%