INDEX
Explanations
verbs related to assistance and effort
Head Attr Weights
0:0.02
1:0.01
2:0.08
3:0.11
4:0.11
5:0.03
6:0.12
7:0.24
8:0.04
9:0.04
10:0.08
11:0.05
Negative Logits
20439      -1.83
acron      -1.76
onyms      -1.58
pse        -1.46
netflix    -1.46
Mahm       -1.41
subtitles  -1.38
uploads    -1.36
Moff       -1.36
requested  -1.36
Positive Logits
arena      1.65
opolis     1.60
ideon      1.60
hedral     1.57
arenas     1.49
sburgh     1.48
leground   1.46
overrun    1.44
ooth       1.41
illet      1.40
Activations Density 0.001%