INDEX
Explanations
references to the word "apt" and its variations
Negative Logits
rist        -0.21
h           -0.19
r           -0.19
y           -0.17
alet        -0.16
rd          -0.16
rug         -0.16
rav         -0.15
ry          -0.15
rs          -0.15
Positive Logits
ech         0.24
itude       0.23
omatic      0.19
ics         0.19
ITUDE       0.19
ools        0.18
orch        0.18
ocurrency   0.16
acular      0.16
iva         0.16
Activations Density: 0.016%