INDEX
Explanations
references to origins or beginnings
references to the concept of origins or foundations
Negative Logits
asel      -0.71
isites    -0.70
hammad    -0.70
tein      -0.69
eers      -0.68
verages   -0.67
eor       -0.65
MRI       -0.63
ramer     -0.61
nesota    -0.61
Positive Logits
roots     1.18
Roots     1.12
roots     1.07
waters    0.83
pring     0.82
ourcing   0.82
dale      0.78
lore      0.75
moss      0.75
pora      0.72
Activations Density 0.009%