INDEX
Explanations
references to personal connections and community engagement
Negative Logits
ivor      -0.18
Insider   -0.16
aks       -0.16
jal       -0.15
-ahead    -0.15
ziel      -0.14
dar       -0.14
okit      -0.14
ritch     -0.14
pri       -0.14
Positive Logits
into      0.28
closer    0.28
together  0.24
alive     0.23
Clo       0.21
Into      0.21
into      0.21
alive     0.21
Into      0.21
back      0.21
Activations Density 0.032%