INDEX
Explanations
attends to tokens describing positive qualities from tokens indicating negative aspects or conditions
Head Attr Weights
0:0.12
1:0.08
2:0.09
3:0.49
4:0.05
5:0.02
6:0.05
7:0.07
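A minimal sketch of one way to read the head attribution weights above: find the head with the largest weight and its share of the listed total. The weights are copied from the list; the names `head_attr_weights` and `dominant_head` are hypothetical and are not part of any dashboard API.

```python
# Head attribution weights copied from the list above (head index -> weight).
head_attr_weights = {
    0: 0.12, 1: 0.08, 2: 0.09, 3: 0.49,
    4: 0.05, 5: 0.02, 6: 0.05, 7: 0.07,
}


def dominant_head(weights):
    """Return (head index, weight) for the head with the largest weight."""
    head = max(weights, key=weights.get)
    return head, weights[head]


head, weight = dominant_head(head_attr_weights)
total = sum(head_attr_weights.values())

# Share of the listed total carried by the dominant head.
share = weight / total
print(f"dominant head: {head}, weight: {weight:.2f}, share of total: {share:.1%}")
```

The listed weights need not sum to exactly 1 (rounding in the display is one plausible reason, though that is an assumption), so the share is computed against the listed total rather than against 1.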
Negative Logits
ſtate: -0.45
AndroidJUnit: -0.44
Efq: -0.40
ImageContext: -0.39
ViewImports: -0.39
houſe: -0.38
perſon: -0.38
greateſt: -0.37
purpoſe: -0.36
Theſe: -0.36
Positive Logits
betweenstory: 0.32
détaillée: 0.31
AccessorTable: 0.30
فريبيس: 0.28
departments: 0.28
procédure: 0.27
OrUpdate: 0.27
diminta: 0.27
CWE: 0.27
boton: 0.26
Activations Density 0.603%