INDEX
Explanations
instances of the phrase "to be" in various contexts
Negative Logits
Trial       -0.15
åı°         -0.15
èĩº         -0.14
isz         -0.14
rial        -0.14
atern       -0.13
lash        -0.13
dél         -0.13
imentary    -0.13
525         -0.13
Positive Logits
honest      0.51
Honest      0.43
frank       0.40
fair        0.40
honesty     0.37
candid      0.34
fair        0.30
Frank       0.28
perfectly   0.28
truthful    0.26
Activations Density 0.023%
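
Two of the negative-logit tokens (åı° and èĩº) look like raw vocabulary strings rather than readable text. The page does not say which tokenizer the model uses, so the following is only a minimal sketch under the assumption that it is a GPT-2 style byte-level BPE vocabulary, where every byte is shown as a stand-in printable character. The helper names `build_byte_decoder` and `decode_byte_level_token` are hypothetical and do not come from the dashboard.

```python
# Sketch only: assumes a GPT-2 style byte-level BPE vocabulary (not stated on the page).

def build_byte_decoder() -> dict[str, int]:
    """Map each stand-in character back to the byte it represents."""
    # Bytes that are displayed as themselves: '!'..'~', 0xA1..0xAC, 0xAE..0xFF.
    printable = (
        list(range(0x21, 0x7F))
        + list(range(0xA1, 0xAD))
        + list(range(0xAE, 0x100))
    )
    decoder = {chr(b): b for b in printable}
    # Every other byte is displayed as chr(256 + k), in ascending byte order.
    offset = 0
    for b in range(256):
        if b not in printable:
            decoder[chr(256 + offset)] = b
            offset += 1
    return decoder


_BYTE_DECODER = build_byte_decoder()


def decode_byte_level_token(token: str) -> str:
    """Turn a vocabulary string into the text it encodes (UTF-8)."""
    raw = bytes(_BYTE_DECODER[ch] for ch in token)
    # A single token may hold only part of a multi-byte character,
    # so undecodable bytes are replaced instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["åı°", "èĩº", "Trial"]:
        print(repr(tok), "->", repr(decode_byte_level_token(tok)))
```

If that assumption holds, the two strings decode to the characters 台 and 臺, while plain ASCII tokens such as `Trial` pass through unchanged.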