INDEX
Explanations
words indicating locations or spaces
Head Attr Weights
0:0.08
1:0.08
2:0.07
3:0.08
4:0.08
5:0.08
6:0.09
7:0.08
8:0.08
9:0.08
10:0.06
11:0.09
Negative Logits
ğ            -1.75
��           -1.61
Gaul         -1.60
Samson       -1.56
Argon        -1.55
>>>>>>>>     -1.52
】           -1.49
Catalonia    -1.49
Rodriguez    -1.47
riks         -1.47
Positive Logits
imum         2.01
conn         1.78
estate       1.74
itutional    1.70
ignt         1.66
incible      1.64
dorm         1.62
abit         1.59
istant       1.55
idate        1.53
Activations Density 0.000%