INDEX
Explanations
attends to decimal point tokens from numeric tokens
Head Attr Weights
0:0.08
1:0.23
2:0.10
3:0.16
4:0.11
5:0.09
6:0.06
7:0.13
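A minimal sketch for reading the head attribution weights above: it ranks the eight heads by weight and sums them. The dictionary values are copied from the list; the names `head_weights` and `rank_heads` are hypothetical and only for illustration, and nothing here is taken from the dashboard's own tooling.

```python
# Head attribution weights copied from the list above (head index -> weight).
head_weights = {
    0: 0.08, 1: 0.23, 2: 0.10, 3: 0.16,
    4: 0.11, 5: 0.09, 6: 0.06, 7: 0.13,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted from largest to smallest weight."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_weights)
top_head, top_weight = ranked[0]
total = sum(head_weights.values())

# Report the dominant head and how much of the listed weight it carries.
print(f"top head: {top_head} (weight {top_weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
print(f"top head share of listed weight: {top_weight / total:.1%}")
```

The listed weights are rounded to two decimals, so their sum need not be exactly 1.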
Negative Logits
camore: -0.43
padek: -0.42
mondi: -0.42
sauvages: -0.41
Vay: -0.40
decir: -0.40
JTable: -0.39
aéri: -0.38
multer: -0.38
Ankunft: -0.38
Positive Logits
amaño: 0.49
timmt: 0.47
írus: 0.45
loroethene: 0.45
osť: 0.44
Sandoval: 0.43
TintMode: 0.43
𝘂: 0.43
незавершена: 0.43
vensko: 0.42
Activations Density 0.205%