INDEX
Explanations
words related to actions of allocating resources or causing movements
words related to provocation and location
Negative Logits
mar         -0.71
test        -0.67
arat        -0.66
sonian      -0.65
character   -0.64
enez        -0.63
erer        -0.60
dred        -0.59
Answers     -0.59
mys         -0.58
Positive Logits
entric      1.02
eus         0.89
atives      0.88
ATIVE       0.79
aukee       0.76
ocation     0.75
ptives      0.75
ISION       0.74
Franch      0.72
lear        0.72
Activations Density: 0.013%