Explanations
HTML or web-related elements, specifically links and stylesheets
Negative Logits
Token             Logit
Италијани         -0.63
surla             -0.60
saveiro           -0.55
PeEnEo            -0.54
bufio             -0.52
EndGlobalSection  -0.52
pleaſure          -0.50
انجليز            -0.50
specifically      -0.49
kháu              -0.49

Positive Logits
Token             Logit
rel               0.52
min               0.44
laikā             0.39
end               0.38
ншни              0.37
ends              0.36
______            0.36
_____             0.36
Воз               0.35
____              0.35

Activations Density: 0.079%