INDEX
Explanations
special characters and symbols often found in text
Head Attr Weights
0:0.04
1:0.02
2:0.08
3:0.07
4:0.04
5:0.05
6:0.38
7:0.06
8:0.05
9:0.07
10:0.05
11:0.04
Negative Logits
shire         -1.58
ividual       -1.57
condem        -1.42
dehydration   -1.38
amusement     -1.35
pilgrimage    -1.35
defe          -1.30
Ced           -1.29
Hag           -1.27
ecause        -1.26
Positive Logits
actionDate    1.74
untled        1.61
�             1.59
SI            1.57
UTF           1.55
icio          1.55
м             1.53
raph          1.53
л             1.51
ゼ            1.49
Activations Density 0.001%