Explanations
highlights or characteristics that define a person, place, or thing
phrases that question or describe the reasons behind qualities or characteristics
Negative Logits
aimon       -0.82
nery        -0.71
Sov         -0.69
duty        -0.67
jury        -0.67
loader      -0.66
earances    -0.65
jab         -0.63
imation     -0.62
nown        -0.61
Positive Logits
us           0.96
them         0.95
him          0.87
me           0.85
Difference   0.78
these        0.72
humanity     0.69
difference   0.69
people       0.68
Amen         0.68
Activations Density: 0.072%