INDEX
Explanations
terms related to identification and categorization of entities and their relationships
Negative Logits
dır            -0.83
d              -0.73
dress          -0.71
down           -0.66
governmental   -0.65
ders           -0.64
dog            -0.64
da             -0.63
daten          -0.63
durch          -0.63
Positive Logits
nnnn           0.80
ning           0.79
ned            0.74
nnn            0.74
ized           0.69
ization        0.67
NNNN           0.66
ize            0.66
izing          0.64
netje          0.64
Activations Density: 1.673%