INDEX
Explanations
Instances where the word "which" is followed by the number 9 or 10
Negative Logits
politics      -0.78
let           -0.75
LET           -0.68
Calling       -0.62
CLOSE         -0.62
Calling       -0.61
UGH           -0.60
dividing      -0.60
RTX           -0.59
UG            -0.59
Positive Logits
originated    0.85
resulted      0.85
contributed   0.82
involve       0.80
lasted        0.79
derive        0.78
consisted     0.78
exceeded      0.77
specialize    0.77
yielded       0.77
Activations Density: 0.022%