INDEX
Explanations
mentions of dirt
references to dirt and its various contexts or implications
Negative Logits
uberty      -0.74
itals       -0.73
eer         -0.72
faculties   -0.70
ofi         -0.69
Osw         -0.69
FontSize    -0.68
ommod       -0.67
aii         -0.66
inel        -0.65
Positive Logits
bag         1.29
bags        1.28
dirt        1.07
bike        1.02
iest        0.95
Dirt        0.93
shed        0.87
brush       0.82
iless       0.80
sie         0.80
Activations Density 0.010%