INDEX
Explanations
words or phrases that introduce a clarifying or explanatory statement.
Negative Logits
condensation   -0.88
dew            -0.85
droplets       -0.68
condens        -0.63
fog            -0.60
sweat          -0.59
kond           -0.59
sweat          -0.59
sweating       -0.58
moisture       -0.58
Positive Logits
let      1.16
allow    1.05
Allow    1.00
Allow    0.96
allow    0.89
lemme    0.88
permit   0.85
Let      0.84
ALLOW    0.81
Let      0.80
Activations Density 1.458%