INDEX
Explanations
the frequency of the word "Johnson" in the text
Negative Logits
es          -0.73
Sche        -0.69
Cale        -0.68
times       -0.66
ü           -0.66
scenes      -0.65
клопе       -0.65
lou         -0.64
Vla         -0.64
Beres       -0.64
Positive Logits
Johnson     1.92
johnson     1.78
JOHNSON     1.77
Johnson     1.77
NSON        1.70
johnson     1.55
JOH         1.21
Jonson      1.06
ValueStyle  1.04
nson        1.02
Activations Density 0.004%