INDEX
Explanations
words related to metaphors and abstract concepts
references to academic concepts, particularly those related to "aphorisms" or concise statements expressing general truths
Negative Logits
PUT       -0.88
nings     -0.81
swick     -0.80
▬▬        -0.78
eer       -0.69
Bauer     -0.65
Adidas    -0.65
eers      -0.65
Clown     -0.65
LOAD      -0.64
Positive Logits
ysics       1.27
ysical      1.24
obia        1.20
orically    1.11
osate       1.05
obic        1.05
urst        0.96
orical      0.96
onso        0.95
oenix       0.95
Activations Density 0.017%