INDEX
Explanations
instances of the word "reason"
Negative Logits
multi       -0.46
FMC         -0.43
ingual      -0.42
Multi       -0.41
at          -0.41
Boutique    -0.41
offshore    -0.40
fs          -0.40
Inoue       -0.40
boutique    -0.40
Positive Logits
reason      2.08
reason      1.90
Reason      1.88
Reason      1.82
REASON      1.69
reasons     1.66
REASON      1.64
Reasons     1.55
Reasons     1.51
reasons     1.50
Activations Density 0.016%