INDEX
Explanations
instances of the word "obvious"
the word "obvious" and its variants, indicating a focus on clear or evident statements
Negative Logits
nan         -0.91
ingers      -0.77
iership     -0.73
borg        -0.73
rams        -0.72
rigan       -0.70
isol        -0.68
psey        -0.68
restling    -0.67
uden        -0.66
Positive Logits
iary          1.00
obvious       0.83
Leilan        0.80
signs         0.76
contrad       0.73
resemblance   0.73
distinction   0.72
Signs         0.70
ances         0.70
\\\\\\\\      0.70
Activations Density 0.012%