INDEX
Explanations
instances of the word "got" in various contexts
Negative Logits
odic      -0.16
628       -0.15
ctr       -0.14
ertos     -0.14
istrat    -0.14
aire      -0.14
indsight  -0.14
oon       -0.13
inished   -0.13
hoo       -0.13
Positive Logits
chas      0.33
cha       0.30
terdam    0.27
rid       0.22
CHA       0.22
chu       0.20
sta       0.19
ton       0.18
Ta        0.18
iate      0.18
Activations Density 0.022%