INDEX
Explanations
mentions of specific names or terms "Ab" and its variations with related numbers
mentions of specific names or locations
Negative Logits
pher         -0.58
Compass      -0.56
gerald       -0.55
Norton       -0.55
patrick      -0.55
glers        -0.54
dogs         -0.54
companion    -0.51
manufact     -0.51
kittens      -0.51
Positive Logits
zeb          0.81
aze          0.72
ande         0.71
oga          0.70
ollah        0.69
abi          0.68
NAS          0.68
oya          0.67
asy          0.66
ichi         0.66
Activations Density: 0.165%