INDEX
Explanations
instances of the word "got"
Negative Logits
itſelf      -0.95
pleaſure    -0.92
purpoſe     -0.83
ſtate       -0.81
perſon      -0.81
houſe       -0.80
himſelf     -0.77
leſs        -0.75
Perſ        -0.73
ſelves      -0.73
Positive Logits
got         1.69
got         1.46
Got         1.45
Got         1.39
GOT         1.21
gotta       1.08
GOT         1.02
Gotta       0.98
Gotta       0.85
Gotcha      0.81
Activations Density: 0.038%