Explanations
terms related to drinking or substances to consume
references to substance use, particularly with a focus on 'drugs'
Negative Logits
Token         Logit
ERN           -0.85
eers          -0.80
Tactics       -0.75
Pose          -0.72
Corpus        -0.70
eering        -0.68
Territories   -0.65
Leilan        -0.65
orkshire      -0.64
Blueprint     -0.64
Positive Logits
Token         Logit
inking        1.23
unk           1.14
inker         1.14
inks          1.12
agons         1.11
agnar         1.10
ink           1.01
ifted         0.95
ifts          0.94
ained         0.93
Activations Density: 0.012%