INDEX
Explanations
instances of the word "small."
references to the concept of 'smallness' or small entities
Negative Logits
alde          -0.70
anthem        -0.67
arov          -0.67
endas         -0.66
fet           -0.66
partying      -0.65
otle          -0.64
oÄŁ           -0.63
allegiance    -0.62
fal           -0.62
Positive Logits
Small         3.43
Small         2.79
small         2.11
Large         1.96
Large         1.61
Tiny          1.56
small         1.53
Medium        1.51
Minor         1.39
Huge          1.33
Activations Density 0.009%