Explanations
words related to acceptance and agreement
Negative Logits
glands        -0.71
ixt           -0.70
ionage        -0.70
vertisement   -0.69
ibur          -0.69
borough       -0.67
Indust        -0.65
igil          -0.65
iferation     -0.65
borgh         -0.64
Positive Logits
ably            0.90
ances           0.85
accepting       0.84
acceptance      0.83
responsibility  0.81
ANCE            0.78
uncond          0.78
enance          0.77
bribes          0.75
accept          0.75
Activations Density: 0.670%