INDEX
Explanations
phrases related to the physical characteristics or attributes of objects
references to quantitative measurements or significant attributes
Negative Logits
gently   -0.70
selves   -0.70
THR      -0.68
ELF      -0.65
odcast   -0.62
rahim    -0.60
shall    -0.60
din      -0.60
begin    -0.60
Wrong    -0.59
Positive Logits
nature         1.06
afforded       0.99
similarities   0.98
disparity      0.93
constraints    0.92
resemblance    0.91
aspect         0.88
inherent       0.88
difference     0.88
discrepancy    0.88
Activations Density 0.377%