Explanations
attends to tokens denoting specific grammatical relationships from tokens that follow them
Head Attr Weights
0:0.12
1:0.14
2:0.10
3:0.09
4:0.14
5:0.15
6:0.10
7:0.12
Negative Logits
ſtate  -0.38
ftate  -0.36
referenties  -0.35
myſelf  -0.35
themſelves  -0.35
Houſe  -0.34
pleaſure  -0.33
المناصب  -0.33
uſe  -0.32
purpoſe  -0.32
Positive Logits
getMenuInflater  0.26
ConstraintMaker  0.26
createCell  0.25
kamers  0.25
ArgumentParser  0.23
PreferredItem  0.23
SpringBootTest  0.23
جة  0.23
nœ  0.22
mazaki  0.22
Activations Density 0.034%