INDEX
Explanations
punctuation marks and special characters
Head Attr Weights
0:0.06
1:0.19
2:0.06
3:0.06
4:0.05
5:0.16
6:0.10
7:0.04
8:0.08
9:0.06
10:0.06
11:0.04
Negative Logits
accidents    -1.67
idan         -1.64
ila          -1.54
anship       -1.52
shelters     -1.52
stress       -1.51
urden        -1.49
abor         -1.48
licences     -1.46
pressures    -1.46
Positive Logits
�            2.02
ْ            1.89
Pixel        1.87
(\           1.83
toc          1.80
Rove         1.69
@            1.65
Buy          1.64
(_           1.63
Icar         1.61
Activations Density 0.001%